cL!cK
Power Member
PcPerspective said:
What is the HYDRA Engine?
At its most basic level, the HYDRA Engine is an attempt to build a completely GPU-independent graphics scaling technology - imagine having NVIDIA graphics cards from the GeForce 6600 to the GTX 280 working together with little to no software overhead and nearly linear performance scaling. HYDRA combines software and hardware designed by Lucid to improve gaming performance in a way that is transparent to both the application and the graphics cards, using dedicated hardware logic to balance graphics information between the CPU and GPUs.
Why does Lucid feel the traditional methods that NVIDIA and AMD/ATI have been implementing are not up to the challenge? The two primary multi-GPU rendering modes that both companies use are split frame rendering and alternate frame rendering. Lucid contends that both have significant pitfalls that its HYDRA Engine technology can correct. For split frame rendering the downside is that every GPU must replicate ALL the texture and geometry data, so the memory bandwidth and geometry shader limitations of a single GPU remain. For alternate frame rendering the drawback is the latency introduced by alternating frames between multiple GPUs, plus the latency required to resolve inter-frame dependencies.
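To make the two traditional modes concrete, here is a minimal Python sketch of how work is assigned in each. The function names are hypothetical, and the even split used for SFR is an assumption for illustration; real drivers may move the split line dynamically.

```python
def afr_assign(frame_index, num_gpus):
    """Alternate frame rendering: whole frames rotate round-robin across GPUs."""
    return frame_index % num_gpus


def sfr_bands(height, num_gpus):
    """Split frame rendering: cut one frame into horizontal bands, one per GPU.

    Returns a list of (top_row, bottom_row) pairs, bottom exclusive.
    Assumes an even split; leftover rows go to the first bands.
    """
    base, extra = divmod(height, num_gpus)
    bands = []
    top = 0
    for i in range(num_gpus):
        rows = base + (1 if i < extra else 0)
        bands.append((top, top + rows))
        top += rows
    return bands
```

In AFR each GPU needs the full scene for its own frame, and in SFR each GPU still needs the full scene for its band, which is where the data replication complaint above comes from.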
How it Works
HYDRA is dedicated silicon with the sole purpose of scaling GPUs. Though there is no graphics processing logic on the HYDRA chip, what the chip can do is redistribute graphics workloads across multiple GPUs in real time. The HYDRA technology also includes a unique software driver that sits between the DirectX architecture and the GPU vendor driver.
The distribution engine, as it is called, is responsible for reading the information passed from the game or application to DirectX before it gets to the NVIDIA or AMD drivers. There the engine breaks up the various blocks of information into "tasks" - a task is a specific job that HYDRA defines that can be passed to any of the 2-4 GPUs in the system. A task might be something like a specific lighting effect, a post-processing run, a specific model being drawn, etc. The company founders on hand at the meeting were a little vague about the algorithms that decide how the DirectX data is divided, and which parts of it become "tasks" - it is obvious that this is part of the magic that gives HYDRA its power; it is with these task definitions that the hardware logic can efficiently distribute the workload across many GPUs.
Once the tasks have been created, they are sent over the PCI Express bus to the HYDRA chip, where they are VERY quickly processed and split between 2 to 4 GPUs. The HYDRA Engine passes these tasks off to the GPUs, waits for the finished data or pixels to come back, and is then responsible for passing that information on to one of the GPUs for final output to a monitor. At the outset, this doesn't sound that much different from what NVIDIA and AMD already do with AFR and SFR rendering modes, but after seeing the HYDRA technology at work it is obviously something very different.
By essentially intercepting the DirectX calls from the game to the graphics cards, the HYDRA Engine is able to intelligently break up the rendering workload rather than just "brute-forcing" alternate frames or split frames as both GPU vendors are doing today in SLI and CrossFire. And according to Lucid all of this is done with virtually no CPU overhead and no latency compared to standard single GPU rendering.
To accompany this ability to intelligently divide up the graphics workload, Lucid is offering scaling between GPUs of any KIND within a brand (only ATI with ATI, NVIDIA with NVIDIA) and the ability to load balance GPUs based on performance and other criteria. The load balancing is based on a couple of key data points: pre-existing knowledge from the Lucid team about the GPU in question and the "response time" of the GPU when being sent data from the HYDRA Engine chip. The HYDRA driver will actually recognize the GPUs in a system and estimate how much processing power each holds, but will then fine-tune that estimate based on the real-time performance of the GPU in action. If a GPU is sent a "task" to perform and the return time on it is slower than expected, the HYDRA engine will back off slightly and send more "tasks" to the less-loaded GPUs. All of this is updated on the fly, in real time as the game is running.
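Lucid has not published its algorithm, so the following is only a minimal Python sketch of the general idea described above: start from a prior estimate of each GPU's speed, send each task to whichever GPU should finish it soonest, and correct the estimate from measured return times. The `Gpu` class, the `weight` and `cost` units, and the smoothing factor `alpha` are all assumptions made for illustration, not anything taken from HYDRA.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    weight: float          # estimated relative throughput (hypothetical units)
    pending: float = 0.0   # outstanding work, in estimated time units


def pick_gpu(gpus, cost):
    """Choose the GPU expected to finish a task of this cost soonest."""
    return min(gpus, key=lambda g: g.pending + cost / g.weight)


def dispatch(gpus, cost):
    """Assign a task and return the chosen GPU plus its expected run time."""
    gpu = pick_gpu(gpus, cost)
    expected = cost / gpu.weight
    gpu.pending += expected
    return gpu, expected


def complete(gpu, cost, expected, measured, alpha=0.2):
    """Retire a task and nudge the GPU's weight toward its observed speed."""
    gpu.pending = max(0.0, gpu.pending - expected)
    observed = cost / measured          # throughput actually achieved
    gpu.weight = (1 - alpha) * gpu.weight + alpha * observed
```

A GPU that returns results later than expected ends up with a lower `weight`, so later calls to `dispatch` favor the other cards, which matches the "back off slightly" behavior described in the article.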
HYDRA Engine Hardware Implementation
From a purely hardware perspective, the HYDRA chip takes in a single PCIe x16 connection and outputs two full PCIe 2.0 x16 connections. Depending on the partner's implementation method, that could connect to two GPUs or split into four x8 PCIe 2.0 connections for four GPUs. What might you find the HYDRA chip on in the future? There are two likely scenarios for potential designs: on a motherboard or on a graphics board.
On a motherboard, including a HYDRA Engine chip would allow ANY chipset to support BOTH SLI and CrossFire technology since it is completely chipset independent and doesn't require SLI or CrossFire licensing. That would enable said motherboard to offer 2-4 GPU scaling with NVIDIA or AMD graphics cards - a VERY compelling solution but also likely an expensive one.
Pic 1 - Diagram of a motherboard implementation
The HYDRA technology would also likely find its way onto custom-design graphics boards in place of the standard PCIe bridge - à la the Radeon HD 4870 X2. Lucid is claiming nearly linear scaling on up to 4 GPUs, compared to 50-70% with SLI or CrossFire, and thus a board vendor could really make a top-performing part and stand out from the crowd, or potentially build one with slower chips for a new price-performance option.
As for the chip itself, obviously Lucid is being very tight-lipped about it. The chip runs very cool and draws just about 5 watts of power. Inside the chip you will find a small RISC processor and the custom (secret sauce) logic behind the algorithm powering the HYDRA Engine. The production chip was JUST finished yesterday and will be sampling to partners soon - though they wouldn't indicate WHO those partners were.
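To put those scaling figures in perspective, here is a small Python sketch. It assumes, as one possible reading of the "50-70%" figure, that each additional GPU contributes that fraction of a single GPU's performance; the article does not say exactly how the percentage is defined, and the function name is hypothetical.

```python
def speedup(num_gpus, efficiency):
    """Speedup over one GPU if each extra GPU adds `efficiency` of a full GPU.

    efficiency=1.0 models perfectly linear scaling (an idealized upper bound).
    """
    return 1 + (num_gpus - 1) * efficiency


for label, eff in [("50%", 0.5), ("70%", 0.7), ("linear", 1.0)]:
    print(label, round(speedup(4, eff), 2))
```

Under that assumption, four GPUs would give roughly 2.5x to 3.1x with SLI or CrossFire versus close to 4x with the near-linear scaling Lucid claims.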
SOURCE
Pic 2 and Pic 3 - Alternate rendering, with each card's workload shown on a separate monitor
I know the post is huge, but it can't be summarized much if we want an accurate idea of the possibilities this brings to the graphics card market. Honestly, I don't know how ATI or NVIDIA haven't thought of this yet.
For now it only works with DX9, but by the end of this year it should work with DX10.1.
Implementations begin in early 2009.