november Tech


Monday, October 09, 2006

Believe it or not, there are still people in major European countries who lack internet access, or even computers. (Yes, we're still recovering from shock, too.) According to the French analyst firm Médiamétrie, half of the homes in France don't have a computer, and 60 percent don't have net access. To solve this problem, the French ISP Neuf Cegetel (which just bought AOL France) has launched a new internet access plan called Easy Neuf, in conjunction with its new cheap computer, the Easy Gate. This Linux box and internet service package can be yours for €40 ($50) per month, plus a €150 security deposit, and if you need a keyboard, mouse, monitor, and webcam, you'll have to fork over an additional one-time fee of €100 ($126). The Easy Gate packs an Intel 852GM chipset (no word on exactly what speed) and comes with six USB ports, 512MB of RAM and 512MB of flash memory, although we're not sure that's nearly enough to do anything but some light surfing and email (though we assume you can expand on that half gig with an external drive or two). Easy Neuf claims to serve up the internet at speeds of up to 8Mbps and includes unlimited VoIP calls to French landlines, so you can call your grandmother in Biarritz all you want. Beyond that, there's one more feature that we raise an eyebrow at -- Easy Gate's "proactive service monitoring" lets the company keep a remote eye on your PC and fix it "without the customer having to call the help line."

Labels: Hardware

Author: Bo Tian » Comments:

0.99mm! Thinnest TFT-LCD

Friday, October 06, 2006

Toshiba Matsushita announced today that it will release the world's thinnest TFT-LCD panel, at only 0.99mm thick. The panel offers QVGA resolution (320×240), measures 2.0" diagonally, and weighs only 3.5g.
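For a sense of how dense that panel is, the pixel density can be worked out from the resolution and the diagonal. This is a quick sketch that assumes square pixels; the function name is ours, not Toshiba's.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density from resolution and diagonal size (assumes square pixels)."""
    diagonal_px = math.hypot(width_px, height_px)  # length of the diagonal in pixels
    return diagonal_px / diagonal_in


# QVGA on a 2.0-inch diagonal, as quoted in the announcement
print(pixels_per_inch(320, 240, 2.0))  # 200.0
```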

Toshiba claims that future handheld devices will be sleeker thanks to the new LCD panel. The first product to use this technology is expected to be released in April 2007.

Labels: Hardware

Author: Bo Tian » Comments:

Stanford: ATI's GPU can Calculate Much Faster

Beyond3D recently sat down with Stanford's Mike Houston, GPGPU guru and B3D regular, to discuss Stanford's new Folding@Home client for ATI GPUs.

Beyond3D: Is the X1K series' dynamic branching performance the enabler that lets you really explore and exploit R580's (and R520's) abilities for GPGPU, and specifically GROMACS in BrookGPU in this case -- in a way that is impossible on any other hardware right now? After that, which of the chip's other abilities are key for GROMACS performance? The ability to sustain close to peak performance in the fragment hardware? Memory bandwidth? Basically, what does the GROMACS core hit hard on the chip, and how are you exploiting that in the application?

Mike Houston: All GPUs are SIMD, so branching has a performance consequence. We have carefully designed the code to have high branch coherence. The code relies heavily on a tremendous amount of looping in the shader. On ATI, the overhead of looping and branching can be covered with math, and we have lots of math. We run the fragment shaders pretty close to peak for the instruction sequence used, i.e. we can't fully use all the pre-adders on the ALUs. But I wouldn't say branching is the enabler. I'd say the incredible memory system and threading design are what currently make the X1K often the best architecture for GPGPU. Those allow us to run the fragment engines at close to peak.

What ATI can do that NVIDIA can't that is currently important to the folding code being run is that we need to dynamically execute lots of instructions per fragment. On NVIDIA, the shader terminates after 64K instructions and exits with R0->3 in Color[0]->Color[3]. So, on NVIDIA, we have to multi-pass the shader, which crushes the cache coherence and increases our off-chip bandwidth requirements, which then exacerbates the below.

The other big thing for us is the way texture latency can be hidden on ATI hardware. With math, we can hide the cost of all texture fetches. We are heavily compute bound by a large margin, and we could actually drive many more ALUs with the same memory system. NVIDIA can't hide the texture latency as well, and perhaps more importantly, even issuing a float4 fetch (which we use almost exclusively to feed the 4-wide vector units) costs 4 cycles. So NVIDIA's cost=ALU+texture+branch, whereas ATI is MAX(ALU, texture, branch).
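The sum-versus-max cost comparison above can be put into a toy model. This is only an illustration of the formula as stated in the interview: the function names and the per-fragment cycle counts are made up, and real hardware is more complicated than either expression.

```python
def serialized_cost(alu: float, texture: float, branch: float) -> float:
    """Nothing overlaps, so the costs add up (the NVIDIA case as described)."""
    return alu + texture + branch


def overlapped_cost(alu: float, texture: float, branch: float) -> float:
    """Everything overlaps, so the slowest component dominates (the ATI case as described)."""
    return max(alu, texture, branch)


# Hypothetical per-fragment cycle counts for a math-heavy shader
alu, texture, branch = 100.0, 40.0, 10.0

print(serialized_cost(alu, texture, branch))  # 150.0
print(overlapped_cost(alu, texture, branch))  # 100.0
```

With these invented numbers the overlapped case is bound purely by ALU work, which matches the "heavily compute bound" description: the texture and branch costs disappear behind the math.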

While it would be possible to run the code on the current NVIDIA hardware, we would have to make pretty large changes to the code they want to run, and even past that, the performance is not great. We will have to look at their next architecture and re-evaluate. The next chips from both vendors should be interesting.
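To see why the 64K-instruction limit mentioned in this answer hurts, here is a rough sketch of how many passes a long shader would need and how much extra off-chip traffic the pass boundaries could add. It assumes "64K" means 65,536 instructions and that each pass boundary writes four float4 outputs (64 bytes) per fragment and reads them back; the fragment and instruction counts are invented for illustration.

```python
import math

INSTRUCTION_LIMIT = 64 * 1024  # assumed meaning of the "64K instructions" quoted above


def passes_needed(instructions_per_fragment: int, limit: int = INSTRUCTION_LIMIT) -> int:
    """Number of shader passes required when one pass can run at most `limit` instructions."""
    return max(1, math.ceil(instructions_per_fragment / limit))


def extra_offchip_bytes(instructions_per_fragment: int, fragments: int,
                        state_bytes_per_fragment: int,
                        limit: int = INSTRUCTION_LIMIT) -> int:
    """Extra traffic if each pass boundary writes the state out and reads it back in."""
    boundaries = passes_needed(instructions_per_fragment, limit) - 1
    return boundaries * fragments * state_bytes_per_fragment * 2  # one write + one read


# Hypothetical workload: 200,000 instructions per fragment, 1,000 fragments,
# 4 float4 outputs (4 * 16 bytes) carried between passes
print(passes_needed(200_000))                  # 4
print(extra_offchip_bytes(200_000, 1_000, 64)) # 384000
```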

Beyond3D: Are you using ATI's Close To the Metal (CTM) API in BrookGPU now, and are you using it for this first Folding@Home implementation? How is it helping BrookGPU get better on R580 and R520 in a theoretical sense?

Mike Houston: There is a BrookGPU CTM backend currently being worked on and we hope to have it public when CTM is public. It is not being used for the current algorithms running in the Folding client though. However, it will enable other algorithms we weren't able to do in the past because of access to larger register files (128 registers!), scatter, and explicit control of the memory formats and memory system. You can do really neat things with CTM that we couldn't before through Direct3D/OpenGL. Being able to render and texture directly from host memory makes debugging much easier, and also allows an easy mechanism for asynchronous transfer to and from the hardware.

The main thing for BrookGPU is that the overheads of GL and D3D go away and we have full control of setting up the board. No extra commands are sent to the board. Also, we can compile directly to the ISA, so we don't have to worry about game optimizations breaking our GPGPU code. This also means that since we talk directly to the board, we are immune to driver changes which makes verification and shipping of actual applications much easier.

CTM is really going to change the way that GPGPU is done and honestly to really do GPGPU for real, you must have low level access to the hardware. Having this access helps you to better bend the architecture to your will, and when that fails, better understanding of how to change your algorithm. Using CTM, we were able to get matrix multiplication on R580 up from ~15Gflops to ~120Gflops by having control over the memory system and formats.
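For readers wondering how matrix multiplication figures like these are usually computed, here is a minimal sketch. It uses the common convention of counting roughly 2n³ floating-point operations for a dense n×n multiply; the interview does not say how the numbers were measured, so the function name and the convention are assumptions on our part.

```python
def matmul_gflops(n: int, seconds: float) -> float:
    """Throughput of a dense n x n matrix multiply, counting about 2*n^3 operations."""
    flops = 2.0 * n ** 3  # n^3 multiplies plus roughly n^3 adds
    return flops / seconds / 1e9


# The improvement quoted in the interview, as a ratio
speedup = 120.0 / 15.0
print(speedup)  # 8.0
```

As a usage example, `matmul_gflops(1000, 2.0)` gives 1.0, i.e. a 1000×1000 multiply that takes two seconds runs at about 1 Gflops.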

Folding@Home is currently written in BrookGPU, and uses the D3D9 backend.

Beyond3D: Can you talk about how you structure GROMACS on a GPU given the GPU's architecture and how it performs?

Mike Houston: The general key is to try to restructure your code to be compute, not bandwidth, bound. This is often difficult, but you can work to restructure your data access patterns to get the best use out of the memory system. GPGPU.org is a fantastic resource on tips and tricks for GPGPU.
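One common way to reason about "compute bound versus bandwidth bound" is a roofline-style estimate: attainable throughput is the smaller of the peak math rate and memory bandwidth multiplied by the number of operations done per byte fetched. The interview does not mention this model, and the peak and bandwidth figures below are invented, so treat it as a sketch of the idea only.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline-style bound: limited by either the ALUs or the memory system."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


def is_compute_bound(peak_gflops: float, bandwidth_gb_s: float,
                     flops_per_byte: float) -> bool:
    """True when memory can feed the ALUs at least as fast as they can consume data."""
    return bandwidth_gb_s * flops_per_byte >= peak_gflops


# Hypothetical board: 300 Gflops peak, 50 GB/s memory bandwidth
print(attainable_gflops(300.0, 50.0, 2.0))   # 100.0 (bandwidth bound)
print(attainable_gflops(300.0, 50.0, 10.0))  # 300.0 (compute bound)
```

Restructuring data access so that each fetched byte feeds more math raises `flops_per_byte`, which is what moves a kernel from the first case to the second.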

Labels: Hardware

Author: Bo Tian » Comments:

New Apple Tablet

Wednesday, October 04, 2006

The Apple rumour mill’s going crazy over speculation the Mac makers are prepping an ultra-portable computer for release next year, but what new technologies are likely to appear in such a secretive device?

For a start we’d expect a touchscreen, something Apple’s been patenting like crazy for the best part of a year.

That screen could even be a multi-touch version, or offer feedback in the form of a physical clicking sensation – Apple’s patented this recently too.

Then there’re those reports that the ‘Podfathers at Cupertino were working on a crazy camera screen, with lenses hidden behind a computer’s display. That means the portable’s LCD could double as an iSight for use in a mobile version of iChat.

Of course, you’ll need a way to control an Apple tablet, and it’s far too boring to rely on a stylus. How about a sensor that knows when your hand’s approaching? Yeah, that’d be cool. Or how about clever on-screen keyboards, so smart they can cope with super-fast touch typing?

What we’d really like to see is an A5-sized pad, easily pocketable but with enough grunt, and speedy memory, to handle programmes like Photoshop even while we’re out and about, as well as toting iLife apps with all the grace of a desktop machine.

There’s a rumour that Apple’s working on a flash-based portable for release early next year. Let’s just hope it’s not another UMPC let-down, and that it comes clad in Apple’s super-stylish skin.

Labels: Hardware

Author: Bo Tian » Comments:

CORSAIR's new 1111MHz DDR2 RAM

Sunday, October 01, 2006

Recently, Corsair and OCZ launched 1111MHz and 1120MHz DDR2 RAM respectively. At this point, even making DDR2-800 is a major technical challenge for manufacturers, and making modules that run above 1GHz is a further test of technical expertise.

Corsair said that its new Dominator series modules can run at 1.11GHz with CAS4, but that making them is extremely difficult. The modules are built from chips received from Micron and then tested. The default manufacturing standard for these chips is 800MHz, CAS3. Fewer than 5% of the chips received can reach 1111MHz, and only 0.5% are used to build the Dominator series, because all the chips that make up a module must work at the same frequency and latency.

According to news reports, fewer than 200 of Corsair's DDR2-1111 modules have been released worldwide. So even if you can bear to pay $600 for this monster, you may not be able to get one.

Labels: Hardware

Author: Bo Tian » Comments: