Today it is time for some theory, involving some simple maths and physics as well. Because I know you love it! The topic for this lecture is Enhanced Intel SpeedStep Technology (EST). If you have an AMD CPU this is not for you, so you may take the day off and enjoy the sun.
Short background
EST is the next iteration of Intel's SpeedStep technology and was introduced with the Pentium M processors. The idea behind SpeedStep is to conserve power by lowering voltage and operating frequency. But why would we want to do that?
Today's power-hungry CPUs are able to draw almost 100 W of power. At the same time the PC has become mobile, and it has also become a living-room object in line with the radio, TV and CD/DVD players. Something mobile would be useless if you had to draw power from the mains, so you install a battery, and it should be as small as possible. But batteries don't hold unlimited energy, so you need to conserve it. Aha. A smaller case also means less air circulation, and as you might know, power turns into heat, and to get rid of heat you need some way to move it. I will not dwell on heat physics, but the common way to "move" heat is by air flow. So here you can choose between more air flow or less heat. More air flow would mean bigger PCs, and that is not good for a laptop. So we want to reduce heat and therefore need to reduce power.
I also mentioned the living-room PC. It is of course usually connected to the mains, but you don't want a big box there, would you? Maybe you do if you are a man, but your wife most likely does not. So we are back to the air flow and heat dissipation theory again. Another issue here is noise. One way to create air flow is by using fans. The rule of thumb is that bigger fans are more silent, while spinning a fan faster to create more air flow means more noise. So you should be able to figure out why we want less power.
Then of course there is the environmental issue. I don't know how many PCs there are in this world, but take a small city with 100 000 citizens who on average use at least one PC each, either at home or at work. If we say each PC consumes 0.4 kWh, that adds up to 40 MWh, so there is definitely an incentive to save power.
Power Management
So how can we save power? The basic formula for power (P) is P = V * I, where V is the voltage and I is the current. So this is simple then? Reduce voltage and current. And that is exactly what EST does. But the observant reader might ask: why do it in steps? Here comes the reason and the idea. The modern PC is quite a powerful workhorse. It is actually so powerful that it has most likely finished what it needs to do long before you are able to react, which means it is idle most of the time. And a CPU that is idle still needs to do something, as it knows nothing but calculations and operations. So what we do is teach it to work slower, to work more efficiently, to relax and to sleep. Of course it doesn't need the rest, it is perfectly able to work day and night nonstop. It needs it to save power. So the CPU gets different states of operation.
States and time
It is now time to get more technical. First we need to learn some terminology. The different states have different names. When the CPU works slower we say that it has been throttled, so we call those T-states (T is for throttle, in case you missed that). When it is sleeping we call them, surprisingly, S-states; here S is for sleep, not surprise! We also have states where the CPU neither works nor sleeps, called C-states, the CPU idle states. The last kind, and the one we will look at in this essay, is the P-states, which are partly about efficiency and partly about how fast or how much the CPU performs. The P is for performance and not power, as is commonly believed. So in short: P-states = EST. There are also other states defined (e.g. Device and Global states) but I will not go into those now. Another important term we need before going further is latency. Latent means to lie hidden or dormant, and latency adds the time factor to it; it is commonly called delay. Latency is how long it takes the CPU to change its state, and quite often the CPU does no useful work during that time. Unfortunately we don't live in a perfect world, so it cannot jump to the next state in an infinitely short time.
Predictions and performance
Now we know that P = V * I, that EST = performance states, and that latency is delay. What else do we need? As the smart reader you are, you have of course already asked yourself when. When do we change states? It is easy to know when to slow down: when there is nothing to do, you slow down. But then? Why do we need all these states? Can't we just jump from max to sleep, and then back to max when we start working again? The clue here is of course the latency. It is like driving a car in the city: when there are traffic lights, it takes time to get back up to speed after getting the green. This affects the CPU as well. Luckily for us, P-states have very short latency (as opposed to idle and sleep states), and the smaller the step, the lower the latency. I am sure one could write books about prediction, so I will not take up your time nagging about it. You just need to know that it exists, and that it is the job of the operating system or the P-state driver to find out when to apply the next state. Of course some of them give you as an end user the means to control some factors depending on your needs and environment. These are commonly called profiles.
Enhanced Intel Speedstep
So, the basic theory is there, and we can go into the operational part of EST. The control is your operating system's task; the states can be defined in ACPI or by other means. The low-level control is done through MSRs in the CPU. MSR is short for Model-Specific Register, and the MSRs control a wide range of aspects of your CPU's operation. For EST there are two important MSRs: IA32_PERF_STATUS (0x198) and IA32_PERF_CTL (0x199). We will call them STS and CTL. STS is, as you might guess, the status (read only) and CTL is the control (read/write). There are other MSRs that also control some aspects, like enabling EST and its features. I have not talked about cores in the processor and will not do so either, except to say that these MSRs exist per core, but only on i7 and newer CPUs are you able to have different P-states per core, the IDA state being an exception (I will mention it later).
Each MSR is a 64-bit word. STS has the following structure:
Bits: 63-48 - E, 47-32 - D, 31-24 - C, 23-16 - B, 15-0 - A
A, D and E are each a 16-bit Performance State Value (PSV). A is the current value, D is the maximum and E is the power-on value. Current obviously changes when new states are requested, maximum mostly stays the same and power-on doesn't change. Power-on is usually the same as maximum on desktop and server CPUs but will be the lowest on mobile CPUs. C is the minimum multiplier, and on almost all CPUs from Intel this is 6. On CPUs older than the Core family it was not used and will show 0. B holds some status bits that tell whether there are THERM or EST events (i.e. a P-state change is occurring or an overtemperature condition has been detected).
CTL has the following structure:
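To make the STS layout a bit more concrete, here is a small Python sketch that splits a 64-bit value into the fields A to E described above. The names `PerfStatus` and `decode_perf_status` are my own for illustration, and the example value is made up rather than read from a real CPU. Actually reading the MSR needs a privileged OS interface, which is not shown here.

```python
from typing import NamedTuple


class PerfStatus(NamedTuple):
    """Fields of the STS word, named after the letters used in the text."""
    current_psv: int     # A, bits 15-0
    status_bits: int     # B, bits 23-16
    min_multiplier: int  # C, bits 31-24
    max_psv: int         # D, bits 47-32
    power_on_psv: int    # E, bits 63-48


def decode_perf_status(raw: int) -> PerfStatus:
    """Split a 64-bit STS value into its five fields by shifting and masking."""
    return PerfStatus(
        current_psv=raw & 0xFFFF,
        status_bits=(raw >> 16) & 0xFF,
        min_multiplier=(raw >> 24) & 0xFF,
        max_psv=(raw >> 32) & 0xFFFF,
        power_on_psv=(raw >> 48) & 0xFFFF,
    )


if __name__ == "__main__":
    # Hypothetical value for illustration only.
    example = 0x0616_4B21_0600_0616
    for name, value in decode_perf_status(example)._asdict().items():
        print(f"{name:15s} 0x{value:04X}")
```

The shifts and masks follow the bit ranges listed in the structure line directly, so if your CPU documents a different layout you only need to adjust those numbers.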
Bits: 63-33 - Reserved, 32 - IDA disengage, 31-16 - Reserved, 15-0 - PSV
The PSV word is the state you request for the CPU, and bit 32 should be set if you do not want to engage IDA. To write to this MSR you have to use a sequence called read-modify-write, which basically means you read the MSR first, modify the word to your liking, and write it back to the MSR.
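The "modify" step of that read-modify-write sequence could be sketched like this in Python. It only computes the new 64-bit word; the actual read and write of the MSR depend on your operating system and are left out. The function name `build_perf_ctl` and the `allow_ida` parameter are my own inventions for the example.

```python
PSV_MASK = 0xFFFF         # bits 15-0: requested Performance State Value
IDA_DISENGAGE = 1 << 32   # bit 32: set to NOT engage IDA
WORD_MASK = 0xFFFF_FFFF_FFFF_FFFF


def build_perf_ctl(old_value: int, psv: int, allow_ida: bool = False) -> int:
    """Return the value to write back to CTL.

    old_value is what was read from the MSR first; all bits other than
    the PSV field and bit 32 (the reserved ones) are carried over unchanged.
    """
    if not 0 <= psv <= PSV_MASK:
        raise ValueError("PSV must fit in 16 bits")
    new_value = old_value & ~(PSV_MASK | IDA_DISENGAGE)  # clear the two fields
    new_value |= psv                                     # insert requested state
    if not allow_ida:
        new_value |= IDA_DISENGAGE                       # keep IDA disengaged
    return new_value & WORD_MASK


if __name__ == "__main__":
    # Hypothetical "read" value; a real one would come from the MSR.
    old = 0x0000_0000_0000_0616
    print(hex(build_perf_ctl(old, 0x4B21)))
```

Keeping the reserved bits exactly as they were read is the whole point of doing it this way instead of writing a freshly built word.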
The PSV structure:
Bits: 15 - Dynamic FSB, 14 - Non-integer bus ratio, 13 - Reserved, 11-8 - Frequency Identifier, 7-6 - Reserved, 5-0 - Voltage Identifier
Dynamic FSB is also called Super Low Frequency Mode (SLFM). What it does is skip every second cycle of the FSB, so in effect you halve the operating frequency. The bus ratio is what is commonly called the multiplier: the CPU operating frequency is created by multiplying the Front Side Bus (FSB) frequency by it. Some CPUs are able to multiply by half integers, so you can control the operating frequency more finely. The Frequency Identifier (FID) is an ID for each frequency and usually corresponds to the full-integer multiplier. The Voltage Identifier (VID) is an ID for each voltage step.
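As an illustration of the PSV layout, the following Python sketch pulls the four fields out of a 16-bit PSV and estimates the resulting core frequency. It assumes that the FID equals the integer multiplier and that `bus_clock_mhz` is the real bus clock, i.e. a quarter of the quad-pumped FSB rating (about 266.67 MHz for a "1066 MHz" bus). `decode_psv` is just a name picked for this example.

```python
def decode_psv(psv: int, bus_clock_mhz: float) -> dict:
    """Decode a PSV using the bit positions listed in the text."""
    slfm = bool(psv & 0x8000)      # bit 15: Dynamic FSB / SLFM
    half_step = bool(psv & 0x4000) # bit 14: non-integer bus ratio
    fid = (psv >> 8) & 0x0F        # bits 11-8: Frequency Identifier
    vid = psv & 0x3F               # bits 5-0: Voltage Identifier

    # Assumption: FID is the integer multiplier, bit 14 adds one half.
    multiplier = fid + (0.5 if half_step else 0.0)
    frequency_mhz = bus_clock_mhz * multiplier
    if slfm:
        frequency_mhz /= 2         # SLFM halves the effective bus clock

    return {
        "slfm": slfm,
        "multiplier": multiplier,
        "fid": fid,
        "vid": vid,
        "frequency_mhz": frequency_mhz,
    }


if __name__ == "__main__":
    for psv in (0x4B21, 0x860B):
        print(hex(psv), decode_psv(psv, 1066.67 / 4))
```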
Voltage Identifier
A great deal of myth surrounds this value, mostly due to the fact that this information is not readily available from Intel without a Non-Disclosure Agreement (NDA). But we will start with some maths and a formula:
Vcc = Vid0 + (VID * Vstep)
Vcc is the resulting core voltage, Vid0 is the voltage for VID = 0, and Vstep is the size of each step. Here is a table for some common CPUs:
Table (all values in mV)
CPU series        Vid0    Vstep   Vboot    Vmin     Vmax
Pentium M         700.0   16.0    xxxx.x   xxxx.x   xxxx.x
E6000, E4000      825.0   12.5    1100.0   850.0    1500.0
E8000, E7000      825.0   12.5    1100.0   850.0    1362.5
X9000             712.5   12.5    1200.0   800.0    1325.0
T9000             712.5   12.5    1200.0   750.0    1300.0
P9000, P8000      712.5   12.5    1200.0   750.0    1300.0
Q9000D, Q8000D    825.0   12.5    1100.0   850.0    1362.5
Q9000M            712.5   12.5    1200.0   850.0    1300.0
One thing to note is that this is not valid for Netburst (Pentium D and Pentium 4 CPUs). They have a somewhat complicated table, which I will introduce later if needed. Those CPUs usually do not support EST anyway. If you have other CPUs that are not listed, I could add them.
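The voltage formula from this section is easy to try out. Here is a small Python sketch that converts a VID to millivolts and back, using the Vid0 and Vstep values from the table. The helper names are made up for the example, and the limit of 63 comes from the VID field being 6 bits wide.

```python
# (Vid0, Vstep) in mV, copied from a few rows of the table above.
VOLTAGE_PARAMS_MV = {
    "Pentium M": (700.0, 16.0),
    "E8000/E7000": (825.0, 12.5),
    "T9000": (712.5, 12.5),
}


def vid_to_mv(vid: int, vid0_mv: float, vstep_mv: float) -> float:
    """Vcc = Vid0 + (VID * Vstep), all in millivolts."""
    if not 0 <= vid <= 0x3F:
        raise ValueError("VID is a 6-bit value (0-63)")
    return vid0_mv + vid * vstep_mv


def mv_to_vid(target_mv: float, vid0_mv: float, vstep_mv: float) -> int:
    """Nearest VID for a wanted voltage (inverse of the formula)."""
    vid = round((target_mv - vid0_mv) / vstep_mv)
    if not 0 <= vid <= 0x3F:
        raise ValueError("voltage is outside the range a 6-bit VID can express")
    return vid


if __name__ == "__main__":
    vid0, vstep = VOLTAGE_PARAMS_MV["T9000"]
    print(vid_to_mv(0x27, vid0, vstep), "mV")
    print(hex(mv_to_vid(1125.0, vid0, vstep)))
```

Note that the result still has to stay between the Vmin and Vmax of your CPU; the sketch does not check that.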
Number of P-states and ACPI
How many P-states does my CPU support? This is a common question. The answer is not as simple as you might believe, and how many is not really very interesting. As you might deduce from the information above, there are quite a lot of VIDs and FIDs that can be combined. The PSV is a 16-bit value, so theoretically there are 65536 of them; of course you have to remove the reserved values and stay within the CPU's operating limits. But you get the picture. I believe that you need at most maybe 10 and can easily make do with 3-5 or even 2. The difference is not likely to be big. But create as many as you feel like.
So then it comes to defining them. There should already be some that were created by your motherboard manufacturer in your ACPI BIOS. But I will give an example of how you can create a table of P-states.
In this example I will use the T9900, a 45nm 3.06GHz processor. We need some information to make the table. As I don't have this CPU myself I will base this entirely on specifications. What I know from the spec is: the bus/core ratio is 11.5 with a bus speed of 1066MHz, it supports SLFM and IDA, and the Vcc range is from 750 to 950 mV in SLFM, 850 to 1100 mV in LFM and 1000 to 1250 mV in HFM. In IDA mode the limits are from 1000 to 1300 mV. I want to have 5 P-states with frequencies from 800MHz to 3.06GHz, plus an IDA mode.
Here is an example of how it may look:
State  Frequency  Multiplier  FID   Voltage  VID   Power
P0     3333 MHz   12.5        0x4C  1200 mV  0x27  56.4 W
P1     3066 MHz   11.5        0x4B  1125 mV  0x21  49.5 W
P2     2667 MHz   10.0        0x0A  1050 mV  0x1B  42.5 W
P3     2133 MHz   8.0         0x08  988 mV   0x16  35.5 W
P4     1600 MHz   6.0         0x06  913 mV   0x10  28.7 W
P5     800 MHz    6.0         0x86  850 mV   0x0B  19.0 W
By lowering frequency you lower the current (I) in the CPU, and by lowering voltage you lower the voltage (!).
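To show how the FID and VID columns of the example table fit together into the 16-bit PSV that goes into CTL, here is a Python sketch that packs them. It treats the FID column as the high byte of the PSV (so it already includes the SLFM and half-multiplier bits), which matches the values in the table. `encode_psv` is a hypothetical helper name, and the power column is not computed because the text does not give a formula for it.

```python
def encode_psv(multiplier: float, vid: int, slfm: bool = False) -> int:
    """Pack multiplier, VID and the SLFM flag into a 16-bit PSV."""
    doubled = multiplier * 2
    if doubled != int(doubled):
        raise ValueError("multiplier must be a whole or half integer")
    whole, half = divmod(int(doubled), 2)
    if not 0 <= whole <= 0x0F:
        raise ValueError("integer part must fit the 4-bit FID field")
    if not 0 <= vid <= 0x3F:
        raise ValueError("VID is a 6-bit value")

    psv = (whole << 8) | vid   # FID in bits 11-8, VID in bits 5-0
    if half:
        psv |= 0x4000          # bit 14: non-integer bus ratio
    if slfm:
        psv |= 0x8000          # bit 15: Dynamic FSB (SLFM)
    return psv


# (state, multiplier, VID, SLFM) taken from the example table above.
EXAMPLE_STATES = [
    ("P0", 12.5, 0x27, False),
    ("P1", 11.5, 0x21, False),
    ("P2", 10.0, 0x1B, False),
    ("P3", 8.0, 0x16, False),
    ("P4", 6.0, 0x10, False),
    ("P5", 6.0, 0x0B, True),
]

if __name__ == "__main__":
    for name, mult, vid, slfm in EXAMPLE_STATES:
        print(name, f"0x{encode_psv(mult, vid, slfm):04X}")
```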
Conclusion
I hope this helped and answered a few of the questions about EST and P-states. I welcome any comments, and if you discover errors or have additions, please let me know so I can correct or add them. I might look at the other states, like the idle and sleep states, later if I have the time and you have the interest.
--- Superhai (2nd of October, 2009) : last update 02-10-2009
References:
http://processorfinder.intel.com/Default.aspx
http://www.intel.com/products/processor/index.htm
http://www.intel.com/products/processor/manuals/index.htm
This post has been edited by Superhai the Great: Nov 1 2009, 02:51 PM




