Optimize Your Code for the Intel Xeon Phi 4/25

Host Site:

Texas Advanced Computing Center

April 25, 2013 (Thursday)
8:30 a.m. to 5 p.m. (CT)
ROC 1.900
Texas Advanced Computing Center

J.J. Pickle Research Campus
10100 Burnet Rd.
Austin, TX 78758

This is an in-person class. There will be no webcast.

This class is intended for intermediate to advanced users of Stampede. Attendees are expected to be able to program using MPI and OpenMP.

The Innovative Technology component of the recently deployed XSEDE Stampede supercomputer at TACC provides access to 8 PetaFlops of computing power in the form of the new Intel Xeon Phi Coprocessor, also known as MIC. While the MIC is x86-based, hosts its own Linux OS, and is capable of running most user codes with little porting effort, the MIC architecture has significant features that differ from those of present x86 CPUs, and optimal performance requires an understanding of the possible execution models and basic details of the architecture. This workshop is designed to introduce Stampede users to the MIC architecture in a practical manner. Multiple lectures and hands-on exercises will acquaint users with the MIC platform and explore the different execution modes, as well as parallelization and optimization, through example testing and reports. Users are also welcome to bring their own codes to compile for MIC.

The workshop will be divided into four sections: introduction to the MIC architecture; native execution and optimization; offload execution; and symmetric execution. In each section, users will spend half the time doing guided hands-on exercises.


PART I – Introduction (1.5 hours) 8:30 – 10:00

Xeon Phi Architecture
Programming models
Native Execution (MPI / Threads / MPI+Threads)
MPI on host and Phi
MPI on host, offload to Phi
Automatic (MKL)
Offload to host from the Phi


Login and explore busybox

BREAK 10:00 – 10:30

PART II – Native Execution (1.5 hours) 10:30 – 12:00

Native Execution
Why run native?
How to build a native application?
How to run a native application?
Best practices for running native
Cache + ALU/SIMD details
Compiler reports


Interactive exercise using compiler reports
Interactive exercise to show logical-to-physical processor mapping

LUNCH 12:00 – 1:00

PART III – Offload Execution (2 hours) 1:00 – 3:00

Offload to Phi
What is offloading?
Automatic offloading with MKL
Compiler-assisted offloading
Offloading inside a parallel region


Interactive exercise with simple offload and data transfer

BREAK 3:00 – 3:30

PART IV – Symmetric Execution (1.5 hours) 3:30 – 5:00

MPI execution
Symmetric execution
Workload distribution
Correct pinning of MPI tasks on host and coprocessor
Interactive exercise showing symmetric execution at work
MPI + offload
Pinning tasks to host and MIC


Exercise with symmetric execution and pinning


In person (Texas Advanced Computing Center)

04/25/2013 08:30 - 04/25/2013 17:00 CDT (SESSION HAS ENDED)
Registration CLOSED
Registration open date: 03/29/2013 15:15 CDT
Registration close date: 04/18/2013 16:00 CDT
Class size restriction: 40 registrants

Contact Information
Bob Garza
Texas Advanced Computing Center
J.J. Pickle Research Campus
10100 Burnet Rd., ROC 1.900
Austin, TX 78758
Posted: 03/29/2013 20:23 UTC