Processors: hardware curiosities

The SoC on the 2nd board seems to be the most interesting one. It carries the marking "VIA John":
cdewHqe.png


This "VIA John" seems to have been the 3rd generation of the "VIA CoreFusion" line: highly integrated, low-power x86 SoCs. However, it appears it was cancelled:
Little is known about the Corefusion John, other than a VIA roadmap PDF from 2005 found here. It was apparently planned as a successor to the Luke, and supposed planned features were to include DDR2 support, hardware WMV9 decoding and HD audio support. It is unknown if this part was ever released, but it likely would have used some variant of a VIA C7 core (since VIA lost licenses required to produce the C3 series.)
https://en.wikipedia.org/wiki/VIA_CoreFusion#John

Photo of the "VIA Mark" (1st generation):
3sq402y.png


OqG5r9Z.png


Photo of the "VIA Luke" (2nd generation):
0UVzZMf.png


cqwlq2q.png


TMRf391.png
 


ACES ‘Composable’ Supercomputer Gets Ready for Phase One Use

ACES will have a variety of nodes with a range of processors – CPUs, GPUs, FPGAs, specialized AI processors – that can be dynamically mixed and matched as needed for particular workflows.
ACES_NSF-600x305.png

ACES_Accelerators-600x286.png

ACES_SYS_Infrastructure-600x273.png

Next generation infrastructure will be dynamically configurable, he said. “In ACES, each server can dynamically pool CPUs or GPUs or storage from the resource pool using the composable fabric and software. Each server is dynamically configurable based on the workload of your research team. We plan to deploy the Dell Omnia software which will support both Slurm and Kubernetes schedulers and they will be integrated with the Liqid software and fabric.”
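The pooling idea above can be illustrated with a purely hypothetical toy sketch in Python. This is not the Liqid or Dell Omnia API; all class and parameter names here are made up to show the concept of composing a "server" from a shared device pool per workload and returning the devices afterwards.

```python
# Toy model of composable infrastructure (hypothetical API, not Liqid/Omnia):
# devices live in a shared pool; a "server" is composed per workload and
# the devices return to the pool when the job finishes.
class ResourcePool:
    def __init__(self, devices):
        # devices: dict mapping device type -> count available in the pool
        self.free = dict(devices)

    def compose(self, **request):
        """Reserve devices for one job; fail if the pool cannot satisfy it."""
        for kind, n in request.items():
            if self.free.get(kind, 0) < n:
                raise RuntimeError(f"pool exhausted: {kind}")
        for kind, n in request.items():
            self.free[kind] -= n
        return dict(request)  # the composed "server"

    def release(self, server):
        """Return a composed server's devices to the pool."""
        for kind, n in server.items():
            self.free[kind] += n

pool = ResourcePool({"cpu": 64, "gpu": 8, "nvme_tb": 24})
job = pool.compose(cpu=16, gpu=4)   # GPU-heavy workflow gets 4 GPUs
pool.release(job)                   # devices go back for the next workload
```

The point of the sketch is only the lifecycle: reserve from a common pool, run, release, so the same physical devices can serve CPU-heavy and GPU-heavy jobs in turn.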
ACES_Composable_Platform-600x302.png

ACES_DynamicResearchArchiteccture-600x335.png

Besides highlighting the different accelerator types available, he noted, “If you need large memory you can dynamically compose up to three terabytes of the memory and then you can run your application that needs [such] a large memory.”
ACES_Workflow_CPU.GPU_-600x303.png

ACES_WORKFLOWS_ACCELERATORS-600x304.png

About the ACES software environment, Liu said, “ACES will host all major HPC AI/ML software and frameworks. We will support the most widely used and most recent application software, with support for JupyterHub. We plan to offer Intel’s oneAPI as the cross-architecture programming framework for CPUs, GPUs and FPGAs. A user can use the same code through oneAPI to run on CPUs, GPUs, and FPGAs.

“We will support Slurm and Kubernetes. We will use Anaconda and EasyBuild to build your software, and we will provide Singularity and Charliecloud for container applications. On the system side, we will install the XSEDE software stack and also support HTCondor.” (see slide below)
ACES-SOFTWARE-ENVIRONMENT-600x312.png

https://www.hpcwire.com/2022/04/04/aces-composable-supercomputer-gets-ready-for-phase-one-use/

Castle-nevermind.gif
 
One more "processor" developed to order for a use in the scientific field

Oil And Gas Industry To Get Its Own Stencil Tensor Accelerator


Franz-Josef Pfreundt, team lead for next-generation architectures for the Fraunhofer Institute for Industrial Mathematics, has seen how compute engines fail in the market, and recalls IBM’s “Cell” processor, which launched in 2007 and which was used by the institute. IBM abandoned the chip, which paired a brawny Power4 core and eight specialized vector engines, in 2010. (The Cell chips famously did most of the compute in the petaflops-busting “Roadrunner” supercomputer at Los Alamos National Laboratory.)

In the wake of Cell’s demise, Pfreundt partnered with Jens Kruger at Fraunhofer and Marty Deneroff at Berkeley National Laboratory to develop the GreenWave chip, a processor aimed at reverse time migration (RTM) workloads common in the oil and gas industry and also at accelerating climate simulations. The initial GreenWave chip was based on Tensilica cores, and the 28 nanometer design included 700 cores, 32 chips per node, an in-order core with scratchpad memory and solid performance. The Greenwave chip also came out in 2013 – too early for the industry and, therefore, with no money to produce it.
In 2016, Pfreundt and Kruger embarked on creating a new chip, teaming with ETH Zurich to design an accelerator that could meet the performance and efficiency demands for HPC and machine learning workloads.

The result is the STX compute engine – that STX is short for stencil and tensor accelerator, which is part of the growing trend toward domain-specific processing – that tries to balance the conflicting demands for high power efficiency, easy programmability, and low costs. The STX chip was designed as part of the larger European Processor Initiative (EPI) that is driving the push for European independence in HPC and, eventually, exascale computing by relying more on EU-developed technologies.

stx-stencil-processing-unit.jpg

The STX is designed to execute “mass kernels on volume points,” Pfreundt says. “Every time you have a constant access pattern, which is any type of stencil with iterative calculation, that will work. One example is wind energy. In the EPI, one side is this Arm processor which is being developed. Then there are the Spanish guys doing a RISC-V vector processor, and we have this stencil processing unit, which in the first place is a specialized accelerator, but it will be general programming. This is the important point: it’s developed from a scientist’s view, so that makes programming easier and keeps the general programmability.”
The design includes a stencil processing unit, or SPU, a small VLIW architecture with some key parts, including the address generation unit.
stx-spu-core.jpg

“When you imagine this large stencil, you have to do a lot of address calculations to get the points in memory where you have to get the data from,” Pfreundt says. “This is done in hardware. Since we have a constant access pattern within the loop, that means we can even put the looping in hardware. You run your loop index in hardware. This makes programming a lot easier. Sure, there are some ideas how to organize the data better. But this is the main thing, and we have this scratchpad memory on the SPU as well.”

Having the address generation hardware reduces the overhead of chips that do so in software, Pfreundt says. It’s needed because users have a constant access pattern that isn’t modified and is always the same, so it makes sense to do it in hardware, similar to how it’s done in FPGAs.
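The "constant access pattern" idea can be made concrete with an illustrative Python sketch (not STX code): in a stencil, the neighbor offsets never change from one iteration to the next, so both the loop index and the fetched addresses can be generated mechanically, which is exactly the work the STX moves from software into its address generation unit and hardware loops.

```python
# Illustrative 1-D 3-point stencil (not STX code). The offsets are the
# same for every iteration -- a constant access pattern -- so the
# addresses src[i + o] and the loop over i are what STX computes in
# hardware instead of in software.
OFFSETS = (-1, 0, 1)          # constant neighbor pattern of the stencil
WEIGHTS = (0.25, 0.5, 0.25)   # smoothing weights

def stencil_step(src):
    """One iteration of the stencil over the interior points."""
    dst = list(src)
    for i in range(1, len(src) - 1):      # loop index: hardware on STX
        dst[i] = sum(w * src[i + o]       # src[i + o]: AGU-style address
                     for o, w in zip(OFFSETS, WEIGHTS))
    return dst

field = [0.0, 0.0, 1.0, 0.0, 0.0]
print(stencil_step(field))   # smoothed: [0.0, 0.25, 0.5, 0.25, 0.0]
```

RTM and many climate kernels are iterated 2-D/3-D versions of exactly this loop shape, which is why a fixed-pattern address generator pays off for them.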

There also are two floating point units (FPUs) that can be 32-bit or 64-bit and run four operations per cycle, as well as the TCDM scratchpad memory. There are four SPUs – with a RISC-V management core – per cluster, several clusters per microtile, and 32 microtiles per chiplet, along with scratchpad memory and a low-latency interconnect.
stx-microtile.jpg


As for coding, developers can write formulas as they would in C, C++, or Fortran, and the STX will work with that. That feeds into the push to make programming easier by avoiding large porting and tuning efforts, he says.

“The compiler is really an essential piece and we spent a lot of time on the compiler technology,” Pfreundt says. “The compiler and the architecture are developed together, so that the compiler guys really have an influence on how the chip looks, and the other way around. Therefore, the code itself is simple enough that the compiler can do the job. It’s called a very long instruction word architecture, but actually the instructions are very short. There’s no vectorization that needs to be done at all, which makes life a lot simpler for a compiler. We have OpenMP support. It’s just one OpenMP call and the code will compile. If we talk about the host, the host is a quad-core RISC-V core.”
stx-llvm-compiler.jpg


The design of the compiler enabled the developers to reach 80 percent of the theoretical peak performance, he says. It’s based on the LLVM architecture.

None of this was done on final hardware; instead, a software simulator was used to run tests of the SPU on a range of workloads, from RTM and fast Fourier transforms (FFT) to convolutional neural networks and average pooling.

“What we see is that you now have many simulations which are combined with machine learning,” Pfreundt says. “They say in the fluid dynamics model, you have a machine learning model that describes the turbulence, and so you very often have this combination now of machine learning with simulation, and so we can do both in an optimal way.”
stx-performance.jpg

Right now, the hardware setup is four SPUs per cluster and 128 clusters per chiplet, with clock speeds of 1.1 GHz, a 12 nanometer FinFET design, and 16 GB of HBM2e memory, with PCI-Express 5.0 and CXL 2.0 support, consuming 35 watts. It is manufactured by Globalfoundries, so it can be built in Europe, and Pfreundt says the STX chip drives the cost down to about a fifth of what a GPU costs.
https://www.nextplatform.com/2022/0...ry-to-get-its-own-stencil-tensor-accelerator/
 
It isn't recent and it was already known, but it is out of the ordinary: the Intel VCA2.
4JDgVGf.jpg

These are 3 computers on a single PCI-Express card.
They are 3 quad-core Xeon E3-1585L v5 CPUs, each with 2 SODIMMs. They are connected through a PCI-Express switch on the card itself.
There is no storage. The host sees each computer on this card over a network (running on top of PCI-Express), sends an operating system image (CentOS) over that network, and the OS runs from RAM.
It is used for video encoding with Quick Sync via the iGPU. It supports 44 simultaneous 1080p H.264 streams or 14 simultaneous 4K H.264 streams.

Video from der8auer:
 
@godevskii Yes, you're right: it uses a microcontroller (the Raspberry Pi RP2040) and lacks the usual interfaces of a computer. Calling it an SBM would have been more correct than SBC. :)
As for not running an OS: it does run an OS, just not a general-purpose one.

Note that it is possible to build a general-purpose computer out of a microcontroller.
 
Is it inside a Lego piece? :)

Continuing this wave of very small devices, I remembered "smart rings". Basically, a ring with a microcontroller, battery, and other components integrated. :)
9csWv4E.png


TLEulCX.png


HXGPLKE.png


A video with the teardown :)

That Infineon CY8C6336BZI is a microcontroller with an Arm Cortex-M4 processor.
 
Very nice. :)

Continuing on the theme of size, with something more common: the motherboard of the 2022 Dell XPS 13 laptop:
dBEveQL.png


And a comparison of the space it occupies between the 2021 version, on the left, and the 2022 version, on the right (the laptop itself is already small, being a 13-inch model):
uxDubWT.png

https://www.tomshardware.com/news/dell-xps-13-price-specs-release-date-2-in-1

That motherboard carries the CPU, the integrated graphics, the RAM (LPDDR5, package-on-package), the SSD, the wireless, etc.
The downside is that it isn't upgradable, and neither are other components outside the motherboard, such as the battery.
 