
Our new tech analyst, Victor Loh, spent some time last week in the weeds of Intel Developer Forum sessions. These sessions are the real meat of IDF, so check out his reports on the topics that will enable new features and affect performance in the coming year.

Development Tools: Multicore and Mac OS X
Intel’s much-hyped multicore technology doesn’t translate to better performance unless software actually takes advantage of the additional processing power. To make sure coders start cranking out software for the new architecture, Intel is providing tools to help developers optimize their code for multicore: the Intel Compilers and Intel Performance Libraries.

There are three primary forms of multicore utilization—independent programs, parallel programs, and memory latency reduction.

One way to utilize multicore capabilities is with independent programs. The basic concept is multitasking: several different programs running at once, each able to occupy its own core.

Parallel programs consist of a single application divided into multiple threads. An example would be rendering two parts of a screen graphic at the same time on separate cores. But the possibility of thread switching and thread interaction adds another level of complexity that must be addressed during development: the code may require locks that allow only a single thread to access a chunk of data at a time.
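The idea can be illustrated with a minimal sketch (a generic Python example of the concept, not Intel's tooling): one task is split across two threads, and a lock serializes access to the shared result exactly as the article describes.

```python
import threading

# One task (summing 0..999999) split across two threads, one slice each.
# The lock ensures only one thread updates the shared total at a time.
total = 0
total_lock = threading.Lock()

def worker(start, end):
    global total
    local = sum(range(start, end))   # each thread works on its own slice
    with total_lock:                 # the "lock" the article describes:
        total += local               # one thread at a time touches shared data

t1 = threading.Thread(target=worker, args=(0, 500_000))
t2 = threading.Thread(target=worker, args=(500_000, 1_000_000))
t1.start(); t2.start()
t1.join(); t2.join()

print(total)  # 499999500000 -- same answer as a single-threaded sum
```

Without the lock, the two `total += local` updates could interleave and silently drop one thread's contribution, which is exactly the kind of thread-interaction bug the extra development complexity guards against.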

A third method of ramping up multicore performance is reducing memory latency. To diminish the lag caused by the disparity between the speed at which the CPU can process data and the time it takes to read from and write to memory, programmers can plan ahead by “warming up” caches and prefetching data to keep the cores constantly fed. This method adds a “helper” thread alongside the main application thread to perform the memory-loading tasks.

While building these strategies into multicore-friendly code may sound daunting, Intel provides some helpful tools. Working with Intel’s own compiler allows developers to perform advanced optimizations: with autoparallelization, the compiler can generate much of the threaded code itself by threading loops where possible. Tools like the autoparallelizer and OpenMP can extract the benefits of multicore performance through optimized software development.
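The helper-thread pattern can be sketched like this (a hypothetical Python illustration of the concept; real implementations would use hardware prefetch instructions or cache warming rather than a simulated load):

```python
import queue
import threading

# While the main thread computes on one chunk of data, a helper thread
# loads the next chunk so the compute thread is never starved waiting on
# memory. The chunk data here stands in for prefetched memory.
CHUNKS = [list(range(i, i + 100)) for i in range(0, 1000, 100)]

loaded = queue.Queue(maxsize=2)      # small buffer: helper stays just ahead

def helper():
    for chunk in CHUNKS:             # the helper's only job is feeding data
        loaded.put(chunk)
    loaded.put(None)                 # sentinel: no more data coming

threading.Thread(target=helper, daemon=True).start()

results = []
while (chunk := loaded.get()) is not None:
    results.append(sum(chunk))       # main thread computes while helper loads

print(sum(results))  # 499500 -- matches sum(range(1000))
```

The bounded queue is the point of the design: the helper runs just far enough ahead to hide load latency without flooding memory (or, in the real case, the cache) with data the main thread isn't ready for.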

For all phases of single- and multicore software development, from coding and correctness checking to performance analysis and algorithm tuning, Intel provides a full set of tools for Windows and Linux environments.

Intel also announced its intention to port its compilers, libraries, and optimization tools to the Mac OS software development platform. The Intel compilers will support C++ and Fortran. Easy migration from Mac to Intel architecture is supposedly made possible by GCC 4.0 in OS X, which is also binary mix-and-match compatible. ICC will plug into Xcode, Apple’s suite of development tools that comes free with Mac OS X. However, Intel will not provide a compiler for Objective-C. The company has also published information about beta versions of its compiler and performance libraries for the Mac OS.

Intel is supporting a number of technology initiatives to ensure that the infrastructure and standards needed to support its integrated digital home entertainment vision, and a ubiquitous Intel presence across several industries, will exist. Naturally, the company wants emerging standards to be fully compatible with Intel hardware and software. The Fall 2005 “Tech-a-Palooza” showcase outlined some of the current progress on standards initiatives.

The Universal 3D (U3D) project “brings 3-D images to mainstream applications by creating a single extensible format for sharing 3D data in any application.” Intel is essentially pushing the open U3D format to do for 3D what JPEG did for 2D. Adobe Acrobat 7.0, for instance, already supports U3D. Future plans include further standards approvals (ISO, IE3) to mainstream U3D, plus enhanced features like high-order surfaces, shading, and mesh compression.

Ensuring interoperability for sharing digital content like music and video among PCs, consumer electronics, and mobile devices in the home and beyond is the mission of the Digital Living Network Alliance (DLNA). A coalition of 249 member companies at present, the DLNA aims to coordinate interoperability initiatives among related industries. It provides interoperability guidelines, such as protocol-level requirements for content sharing, and has established mandatory and optional media file formats for imaging, audio, and video. Upcoming certification and logo programs will give vendors the opportunity to advertise their products’ DLNA compliance.

802.11n is touted as the next-generation wireless LAN standard. Intel has been galvanizing cross-industry support for 802.11n because of its potential for scalability and interoperability beyond PCs, to consumer electronics and mobile communications devices. The throughput of 802.11n is expected to be at least 500 percent faster than 802.11a/g, which amounts to an actual minimum throughput of about 100 Mbps once overhead such as packet fragmentation is taken into account. The improved coverage in the digital home promises to eliminate dead zones and boost coverage in hotspots.

Complementary to 802.11n, mesh networking (802.11s) creates a truly wireless solution by minimizing and easing configuration issues. Intelligence built into the wireless devices themselves will allow them to discover connectivity automatically. Eliminating the burdensome network configuration and troubleshooting so many of us have faced at home and in the workplace will be a welcome relief. 802.11s will also enable interoperability and signal penetration through obstructions such as metal walls, where varying link quality would normally degrade or derail connectivity. What other benefits can we expect at home? On an 802.11s network, QoS-sensitive traffic such as streaming video will be able to choose the most efficient of multiple available paths for optimal performance.

In the more intimate personal-area-networking space, Intel has pledged to do the same “heavy lifting” for wireless USB that it did to make USB 2.0 as universally adored as it is today. Developing a wireless host controller interface specification allows hardware and software development to proceed separately, and a host-centered design reduces the complexity required on the device side: as with wired USB, WUSB devices will not require a processor. Like other wireless standards, the secret ingredient is interoperability. With any luck, we could be seeing applications for WUSB’s 480 Mbps effective bandwidth, such as multiple streaming HDTV signals, as early as 2006.

It remains to be seen how WUSB and the other ultrawideband (UWB) technologies will reconcile themselves in the minds of consumers. Right now, the usage models overlap substantially, so some consumer confusion is likely.

Intel is still betting that the wired backbone will be a fundamental component of home network connectivity. The HomePlug Powerline Alliance is tasked with promoting an open specification for interoperable, standards-based powerline networking. Powerline data transmission using the recently approved HomePlug AV specification can reach 200 Mbps. The HomePlug broadband-over-powerline specification, based on the AV standard, is expected to be ratified by the middle of 2006.

Not surprisingly, Intel has taken a genuine interest in catering to the PC enthusiast crowd in recent years. Not only are these early adopters passionate about extracting maximum performance from the “fastest, latest, and greatest” hardware, they’re veritable computer experts (often working in IT themselves) whose opinions influence the purchasing decisions of colleagues and friends. The niche they occupy is quite small, accounting for only the top 5 percent of total desktop purchases. But their influence as trendsetters has gotten Intel’s attention and earned serious gamers a degree of respect in the industry as educated and empowered consumers.

During this session, ATI’s Daniel Taranovsky shared some information about CrossFire, ATI’s high-end dual-GPU solution. A characteristic feature that differentiates CrossFire from Nvidia’s comparable SLI architecture is a separate chip referred to as the compositing engine. It is primarily used to blend the results of each GPU’s processing to render a scene, and it lends the card flexibility, since other features, such as math calculations (divides, gamma correction), can be integrated into the compositing engine as well.

CrossFire supports three primary performance modes for dividing up the processing workload: Scissor, SuperTile, and Alternate Frame.

The first, scissor mode, divides each frame into top and bottom halves, and each GPU is tasked with rendering one half.

SuperTile, the second mode, splits each frame into 32 x 32 pixel tiles, like a chessboard. This is a superior solution to scissor in terms of load balancing: in scissor mode it’s possible to load a single GPU more heavily, for example when a lot of action is taking place in the bottom half of the screen. The less-loaded GPU is then forced to wait for the other, and performance suffers. The SuperTile configuration partitions the screen with finer granularity, allowing GPU processing loads to be distributed more evenly.
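The load-balancing difference between the two modes comes down to how pixels map to GPUs, which can be sketched as simple assignment rules (a Python illustration of the concept; the function names and the checkerboard math are our own, not ATI's):

```python
TILE = 32  # SuperTile uses 32 x 32 pixel tiles

def scissor_gpu(x, y, height):
    """Scissor: top half of the frame to GPU 0, bottom half to GPU 1."""
    return 0 if y < height // 2 else 1

def supertile_gpu(x, y):
    """SuperTile: 32x32 tiles assigned to GPUs in a checkerboard pattern."""
    return ((x // TILE) + (y // TILE)) % 2

# On a 1024x768 screen, a pixel in the action-heavy bottom half:
print(scissor_gpu(500, 700, 768))   # 1 -- ALL bottom-half work lands on GPU 1
print(supertile_gpu(500, 700))      # 0 -- checkerboard spreads that region out
```

The checkerboard rule guarantees that any screen region larger than a few tiles is split roughly half-and-half between the GPUs, which is why SuperTile balances better than scissor when the action clusters in one part of the frame.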

The third mode, alternate frame mode, splits the workload frame by frame: GPU 1 renders all the odd frames, while GPU 2 renders all the even frames. This mode allows full geometry acceleration.

Another feature ATI spent some time plugging was its SuperAA image quality mode. Allowing up to 12x and 14x antialiasing with dual GPUs, the approach takes multiple samples per pixel to determine the best sample pattern. Individual strands of hair, for example, can sometimes be less than a single pixel wide. ATI claims that SuperAA’s robust sample pattern will effectively smooth out this type of micro-geometry and add finer detail to object contours. We can’t wait to find out how it stacks up against Nvidia’s SLI “16x” AA solution.

Finally, CrossFire provides a little more flexibility when it comes to video card selection. You can link any two cards in the Radeon X850 family to get the CrossFire benefits; the same goes for the Radeon X800 family. The final bit of flexibility is ATI’s validation of CrossFire graphics cards on motherboards using Intel chipsets. Currently, two motherboards using the Intel 955X chipset are shipping with two PCI Express x16 slots: ASUS is shipping its P5WD2 series, while Intel is shipping the D955XBK.
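The alternate-frame scheduling described above reduces to a round-robin rule over frame numbers (a quick Python sketch of the concept, using the GPU numbering from the session):

```python
def afr_gpu(frame_number):
    """Alternate Frame: GPU 1 takes odd frames, GPU 2 takes even frames."""
    return 1 if frame_number % 2 else 2

# The first six frames alternate strictly between the two GPUs:
print([afr_gpu(f) for f in range(1, 7)])  # [1, 2, 1, 2, 1, 2]
```

Because each GPU owns entire frames, each runs the full pipeline, geometry included, on its frames, which is why this mode gets full geometry acceleration where the screen-splitting modes do not.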

The major issue for ATI, though, is ship dates. Several OEM contacts informed us that CrossFire was probably a month away from shipping, so it looks like CrossFire may ship at roughly the same time as the new R520 GPU. We’ll just have to wait and see.

This article was originally published on extremetech.com.
