ProjectOSX

> Enhanced Intel Speedstep, The mystery about VID's, FID's, IDA, power, multipliers explai
Superhai the Gre...
post Oct 2 2009, 01:00 AM
Post #1
Good evening everyone!

Today it is time for some theory, involving some simple maths and physics as well. Because I know you love it! The topic for this lecture is Enhanced Intel Speedstep (EST). So if you have AMD this is not for you, so you may take the day off and enjoy the sun.

Short background

EST is the next iteration of Intel SpeedStep technology and was introduced with the Pentium M processors. The idea behind SpeedStep is to conserve power by lowering voltage and operating frequency. But why would we want to do that?

Today's power-hungry CPUs can draw almost 100 W of power. At the same time the PC has become mobile, and also a living-room object alongside the radio, TV and CD/DVD player. Something mobile would be useless if it had to draw power from the mains, so you install a battery, and it should be as small as possible. But batteries don't hold unlimited energy, so you need to conserve it. Aha. A smaller case also means less air circulation and, as you might know, power creates heat, and to dissipate heat you need some way to move it. I will not dwell on heat physics, but the common way to "move" heat is by air flow. So here you can choose between more flow or less heat. More flow would mean bigger PCs, and that is not good for a laptop. So we want to reduce heat and therefore need to reduce power.

I also mentioned the living-room PC. It is of course usually connected to the mains, but you don't want a big box there, do you? Maybe you do if you are a man, but your wife most likely does not. So we are back to the air flow and heat dissipation theory again. Another issue here is noise. One way to create air flow is by using fans. The rule of thumb is that bigger fans are quieter, and that spinning a fan faster to create more air flow means more noise. So you should be able to figure out why we want less power.

Then of course there is the environmental issue. I don't know how many PCs there are in this world, but take a small city with 100 000 citizens who on average use at least one PC each, either at home or at work. If we say that each PC consumes 0.4 kWh, that adds up to 40 MWh, so there is definitely an incentive to save power.

Power Management

So how can we save power? The basic formula for power (P) is P = V * I, where V is the voltage and I is the current. So this is simple then? Reduce voltage and current. And that is exactly what EST does. But the observant reader might ask: why do it in steps? Here comes the reason. The modern PC is quite a powerful workhorse; it is so powerful that it has most likely finished what it needs to do long before you are able to react. Which means it is idle most of the time. And a CPU that is idle still needs to do something, as it knows nothing but calculations and operations. So we teach it to work slower, to work more efficiently, to relax and to sleep. Of course it doesn't need rest; it is perfectly able to work day and night nonstop. It needs this to save power. So the CPU gets different states of operation.

States and time

Now it is time to get more technical. First we need to learn some terminology. The different states have different names. When the CPU works slower we say that it has been throttled, so we call these T-states; T is for throttle, in case you missed that. When it is sleeping we call them, unsurprisingly, S-states; here S is for sleep, not surprise! There are also states in which the CPU neither works nor sleeps, called C-states; the C most likely stands for CPU, as in CPU idle states. The last kind, and the one we will look at in this essay, is the P-states, which are partly about efficiency and partly about how fast or how much the CPU performs. The P is for performance, not power as is commonly believed. So in short: P-states = EST. There are also other states defined (e.g. Device and Global states), but I will not go into those now.

Another important term we need before going further is latency. Latent means lying hidden or dormant, and latency adds the time factor to it; it is commonly called delay. Latency is how long it takes for the CPU to change its state, and often the CPU does no useful work during that time. Unfortunately we don't live in a perfect world, so it cannot jump to the next state in an infinitely short time.

Predictions and performance

Now we know that P = V * I, that EST = performance states, and that latency is delay. What else do we need? Smart reader that you are, you have of course already asked yourself: when? When do we change states? It is easy to know when to slow down: when there is nothing to do, you slow down. But then? Why do we need all these states? Can't we just jump from max to sleep, and then back to max when we start working again? The clue here is of course the latency. It is like driving a car in the city: with traffic lights, it takes time to get back up to speed after the light turns green. This affects the CPU as well. Luckily for us, P-states have very short latency (as opposed to idle and sleep states), and the shorter the step, the lower the latency. I am sure one could write books about prediction, so I will not take up your time with it. You just need to know that it exists, and that it is the mission of the operating system or the P-state driver to find out when to apply the next state. Of course some of them give you as an end user the means to control some factors depending on your needs and environment. These are commonly called profiles.

Enhanced Intel Speedstep

So, the basic theory is there, and we can go into the operational part of EST. High-level control is your operating system's task; the P-states can be defined in ACPI or by other means. The low-level control is done through MSRs in the CPU. MSR is short for Model-Specific Register, and these registers control a wide range of aspects of your CPU's operation. For EST there are two important MSRs: IA32_PERF_STATUS (0x198) and IA32_PERF_CTL (0x199). We will call them STS and CTL. STS is, as you might guess, the status (read only) and CTL is the control (read/write). There are other MSRs that also control some aspects, like enabling EST and its features. I have not talked about the cores in the processor and will not do so either, except to say that these MSRs are per core, but only on i7 and newer CPUs can you have different P-states per core, the IDA state being an exception (I will mention it later).

Each MSR is a 64-bit word. STS has the following structure:

Bits: 63-48 - E, 47-32 - D, 31-24 - C, 23-16 - B, 15-0 - A

A, D and E are each a 16-bit Performance State Value (PSV). A is the current value, D is the maximum and E is the power-on value. The current value obviously changes when new states are requested, the maximum mostly stays the same and the power-on value doesn't change. Power-on is usually the same as maximum on desktop and server CPUs but will be the lowest on mobile CPUs. C is the minimum multiplier, and on almost all Intel CPUs this is 6. On CPUs older than the Core family it was not used and will show 0. B holds some status bits that tell whether there are THERM or EST events (e.g. a P-state change is occurring or an overtemperature condition has been detected).
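To make the STS layout concrete, here is a minimal Python sketch that splits a 64-bit value into the fields A to E described above. The names `PerfStatus` and `decode_perf_status` are made up for this illustration, and the raw value is invented, not read from a real CPU.

```python
from typing import NamedTuple


class PerfStatus(NamedTuple):
    """Fields of the STS word, named after the letters used in the text."""
    current_psv: int     # A, bits 15-0
    status_bits: int     # B, bits 23-16
    min_multiplier: int  # C, bits 31-24
    max_psv: int         # D, bits 47-32
    power_on_psv: int    # E, bits 63-48


def decode_perf_status(raw: int) -> PerfStatus:
    """Split a 64-bit STS value into its five fields."""
    return PerfStatus(
        current_psv=raw & 0xFFFF,
        status_bits=(raw >> 16) & 0xFF,
        min_multiplier=(raw >> 24) & 0xFF,
        max_psv=(raw >> 32) & 0xFFFF,
        power_on_psv=(raw >> 48) & 0xFFFF,
    )


# Invented example value (assumption): power-on 0x0616, maximum 0x4B21,
# minimum multiplier 6, no status bits set, current 0x0816.
raw = (0x0616 << 48) | (0x4B21 << 32) | (6 << 24) | (0x00 << 16) | 0x0816
sts = decode_perf_status(raw)
print(f"current=0x{sts.current_psv:04X} max=0x{sts.max_psv:04X} "
      f"min multiplier={sts.min_multiplier}")
```

Actually reading MSR 0x198 needs privileged, platform-specific access and is not shown here.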

CTL has the following structure:

Bits: 63-33 - Reserved, 32 - IDA disengage, 31-16 - Reserved, 15-0 - PSV

The PSV word is the state you request for the CPU, and bit 32 should be set if you do not want IDA to engage. To write to this MSR you have to use a sequence called read-modify-write, which basically means that you read the MSR first, modify the word to your liking, and write it back to the MSR.
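The "modify" step of read-modify-write can be sketched in Python as a pure function on the 64-bit word. This is only an illustration under the CTL layout given above: `update_perf_ctl` is a made-up name, and the actual MSR read and write (which need privileged access) are left out.

```python
PSV_MASK = 0xFFFF        # bits 15-0: requested PSV
IDA_DISENGAGE = 1 << 32  # bit 32: set to keep IDA from engaging
MASK_64 = 0xFFFFFFFFFFFFFFFF


def update_perf_ctl(old_value: int, psv: int, allow_ida: bool) -> int:
    """Return the new CTL word, leaving all reserved bits as they were read."""
    if not 0 <= psv <= PSV_MASK:
        raise ValueError("PSV must fit in 16 bits")
    # Clear only the two fields we own, keep everything else untouched.
    new_value = old_value & ~(PSV_MASK | IDA_DISENGAGE)
    new_value |= psv
    if not allow_ida:
        new_value |= IDA_DISENGAGE
    return new_value & MASK_64


# Invented "read" value (assumption): some reserved bits set, old PSV 0x0616.
old_ctl = (0x5 << 40) | 0x0616
new_ctl = update_perf_ctl(old_ctl, 0x0B21, allow_ida=False)
print(f"0x{new_ctl:016X}")
```

The point of reading first is visible in the example: the reserved bits of `old_ctl` survive in `new_ctl`, and only the PSV and bit 32 change.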

The PSV structure:

Bits: 15 - Dynamic FSB, 14 - Non-integer bus ratio, 13 - Reserved, 11-8 - Frequency Identifier, 7-6 - Reserved, 5-0 - Voltage Identifier

Dynamic FSB is also called Super Low Frequency Mode (SLFM), and what it does is skip every second cycle of the FSB, so in effect you are able to halve the operating frequency. Bus ratio is what is commonly called the multiplier: the CPU operating frequency is created by multiplying the Front Side Bus (FSB) frequency. Some CPUs are able to multiply by half integers, so you can control the operating frequency more finely. The Frequency Identifier (FID) is an ID for each frequency, and usually corresponds to the integer part of the multiplier. The Voltage Identifier (VID) is an ID for each voltage step.
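As a rough Python sketch, a PSV can be taken apart like this. It follows the bit layout exactly as listed above (FID in bits 11-8; bit 12 is not described there, so it is ignored) and assumes that the FID equals the integer part of the multiplier, which the text only says is usually the case. The function names and the 266.67 MHz base clock (a 1066 MHz quad-pumped bus) are assumptions for the example.

```python
from typing import NamedTuple


class PSV(NamedTuple):
    dynamic_fsb: bool  # bit 15, SLFM
    half_step: bool    # bit 14, non-integer bus ratio
    fid: int           # bits 11-8
    vid: int           # bits 5-0


def decode_psv(psv: int) -> PSV:
    """Split a 16-bit PSV into its flags, FID and VID."""
    return PSV(
        dynamic_fsb=bool(psv & 0x8000),
        half_step=bool(psv & 0x4000),
        fid=(psv >> 8) & 0x0F,
        vid=psv & 0x3F,
    )


def core_mhz(p: PSV, base_clock_mhz: float) -> float:
    """Core frequency, assuming FID is the integer multiplier."""
    multiplier = p.fid + (0.5 if p.half_step else 0.0)
    bus = base_clock_mhz / 2 if p.dynamic_fsb else base_clock_mhz
    return multiplier * bus


BASE_CLOCK_MHZ = 266.67  # assumption: 1066 MHz bus, quad-pumped

for word in (0x4B21, 0x860B):
    p = decode_psv(word)
    print(f"0x{word:04X}: {p} -> {core_mhz(p, BASE_CLOCK_MHZ):.0f} MHz")
```

With these assumptions 0x4B21 is a multiplier of 11.5 at full bus speed, and 0x860B is a multiplier of 6 with the bus halved by SLFM.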

Voltage Identifier

A great deal of myth surrounds this value, mostly due to the fact that this information is not readily available from Intel without a Non-Disclosure Agreement (NDA). But we will start with some maths and a formula:

Vcc = Vid0 + (VID * Vstep)

Vcc is the present core voltage, Vid0 is the voltage for VID = 0, and Vstep is the size of each step. Here is a table for some common CPUs:
CODE
Table (All Values in mV)
    
    CPU series     Vid0     Vstep     Vboot     Vmin     Vmax
    Pentium M       700,0    16,0     xxxx,x    xxxx,x   xxxx,x
    E6000, E4000    825,0    12,5     1100,0     850,0   1500,0
    E8000, E7000    825,0    12,5     1100,0     850,0   1362,5
    X9000           712,5    12,5     1200,0     800,0   1325,0
    T9000           712,5    12,5     1200,0     750,0   1300,0
    P9000, P8000    712,5    12,5     1200,0     750,0   1300,0
    Q9000D, Q8000D  825,0    12,5     1100,0     850,0   1362,5
    Q9000M          712,5    12,5     1200,0     850,0   1300,0


One thing to note is that this is not valid for Netburst (Pentium D and Pentium 4 CPUs); they have a somewhat complicated table, which I will introduce later if needed. Those CPUs usually do not support EST anyway. If you have other CPUs that are not listed, I could add them.
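The formula Vcc = Vid0 + (VID * Vstep) is easy to try out. The small Python sketch below uses the Vid0 and Vstep values from the T9000 row of the table; the function names are made up for the illustration, and the Vmin/Vmax limits from the table are deliberately not enforced.

```python
def vcc_mv(vid: int, vid0_mv: float, vstep_mv: float) -> float:
    """Vcc = Vid0 + (VID * Vstep), all voltages in mV."""
    if not 0 <= vid <= 0x3F:
        raise ValueError("VID is a 6-bit value (0-63)")
    return vid0_mv + vid * vstep_mv


def vid_for_voltage(target_mv: float, vid0_mv: float, vstep_mv: float) -> int:
    """Nearest VID for a wanted voltage (the inverse of the formula)."""
    return round((target_mv - vid0_mv) / vstep_mv)


# T9000 row of the table: Vid0 = 712.5 mV, Vstep = 12.5 mV
T9000_VID0_MV = 712.5
T9000_VSTEP_MV = 12.5

print(vcc_mv(0x21, T9000_VID0_MV, T9000_VSTEP_MV))
print(hex(vid_for_voltage(1200, T9000_VID0_MV, T9000_VSTEP_MV)))
```

Before using a VID computed this way you would still have to check the result against the Vmin and Vmax columns for your CPU.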

Number of P-states and ACPI


How many P-states does my CPU support? This is a common question. The answer is not as simple as you might believe, and the exact number is not really very interesting. As you might deduce from the information above, there are quite a lot of VIDs and FIDs that can be combined. The PSV is a 16-bit value, so theoretically there are 65536 combinations; of course you have to remove the reserved values and stay within the CPU's operating limits. But you get the picture. I believe that you need at most maybe 10, and can manage fine with 3-5 or even 2. The difference is not likely to be big. But create as many as you feel like.

Then it is a matter of defining them. There should already be some that were created by your motherboard manufacturer in your ACPI BIOS. But I will give an example of how you can create a table of P-states.

In this example I will use the T9900, a 45 nm 3.06 GHz processor. We need some information to make the table. As I don't have this CPU myself, I will base this entirely on the specifications. What I know from the spec is: the bus/core ratio is 11.5 with a bus speed of 1066 MHz, it supports SLFM and IDA, and the Vcc range is 750 to 950 mV in SLFM, 850 to 1100 mV in LFM (Low Frequency Mode) and 1000 to 1250 mV in HFM (High Frequency Mode). In IDA mode the limits are 1000 to 1300 mV. I want to have 5 P-states with frequencies from 800 MHz to 3.06 GHz, plus an IDA mode.

Here is an example of how it may look:
CODE
   Frequency   Multiplier  FID   Voltage   VID   Power
P0 3333 MHz     12.5       0x4C  1200 mV   0x27   56,4 W
P1 3066 MHz     11.5       0x4B  1125 mV   0x21   49,5 W
P2 2667 MHz     10.0       0x0A  1050 mV   0x1B   42,5 W
P3 2133 MHz      8.0       0x08   988 mV   0x16   35,5 W
P4 1600 MHz      6.0       0x06   913 mV   0x10   28,7 W
P5  800 MHz      6.0       0x86   850 mV   0x0B   19,0 W


By lowering frequency you lower the current (I) in the CPU, and by lowering voltage you lower the voltage (!). 
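The FID and VID columns of the table above can be reproduced with a few lines of Python. This is a sketch under assumptions: Vid0 = 712.5 mV and Vstep = 12.5 mV are taken from the T9000 row of the voltage table, the base clock is taken as a quarter of the 1066 MHz bus speed, and `encode_psv` is a made-up helper, not part of any real tool.

```python
VID0_MV = 712.5               # T9000 row of the voltage table
VSTEP_MV = 12.5
BASE_CLOCK_MHZ = 1066.67 / 4  # assumption: quad-pumped 1066 MHz bus


def encode_psv(multiplier: float, voltage_mv: float, slfm: bool = False) -> int:
    """Build a 16-bit PSV from a multiplier, a voltage and the SLFM flag."""
    fid = int(multiplier)                     # integer part of the multiplier
    psv = (fid << 8) | round((voltage_mv - VID0_MV) / VSTEP_MV)
    if multiplier - fid == 0.5:
        psv |= 0x4000                         # bit 14: non-integer bus ratio
    if slfm:
        psv |= 0x8000                         # bit 15: dynamic FSB (SLFM)
    return psv


# (name, multiplier, voltage in mV, SLFM) as in the table
states = [
    ("P0", 12.5, 1200, False),
    ("P1", 11.5, 1125, False),
    ("P2", 10.0, 1050, False),
    ("P3", 8.0, 988, False),
    ("P4", 6.0, 913, False),
    ("P5", 6.0, 850, True),
]

psvs = {}
for name, mult, mv, slfm in states:
    psvs[name] = encode_psv(mult, mv, slfm)
    bus = BASE_CLOCK_MHZ / 2 if slfm else BASE_CLOCK_MHZ
    print(f"{name} {mult * bus:5.0f} MHz  PSV=0x{psvs[name]:04X}")
```

The high byte of each PSV is the FID column of the table and the low byte is the VID column. The power column is not reproduced, since the post does not say how those figures were obtained.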

Conclusion

I hope this helped and answered a few of the questions about EST and P-States. I welcome any comments, and if there are additions or errors you discover please let me know so I may correct them or add it. I might look at the other states like Idle and Sleep states later if my time and your interest is there. 

--- Superhai (2nd of October, 2009) : last update 02-10-2009
References:


http://processorfinder.intel.com/Default.aspx

http://www.intel.com/products/processor/index.htm


http://www.intel.com/products/processor/manuals/index.htm

This post has been edited by Superhai the Great: Nov 1 2009, 02:51 PM
I have decided to leave the OSX86 scene, but if there are any particularities you need to address arising from my contributions, you may contact me on voodoo@superhai.com
If you want to work on any of my sources or projects, or if you are interested in hosting the files I provided you may use the same address to arrange a way to get hold of them. Any requests or questions for unrelated support, or other unrelated requests will be blacklisted, banned or regarded as an abuse and dealt with accordingly. 
Of course community friends may also use the address and you know who you are.
shoarthing
post Oct 4 2009, 08:25 AM
Post #2
Thank you. Could you include the Atom family - specifically the useful 330?

Intel's compact *.pdf is here - Section 3.4 on pp. 17-18 seems to contain at least an outline of the information you need, but I would appreciate it if you would check.
cVad
post Oct 4 2009, 10:17 AM
Post #3
QUOTE (Superhai the Great @ Oct 2 2009, 05:00 AM) *
...Here is an example how it may look like:
CODE
   Frequency   Multiplier  FID   Voltage   VID   Power
P0 3333 MHz     12.5       0x4C  1200 mV   0x27   56,4 W
P1 3066 MHz     11.5       0x4B  1125 mV   0x21   49,5 W
P2 2667 MHz     10.0       0x0A  1050 mV   0x1B   42,5 W
P3 2133 MHz      8.0       0x08   988 mV   0x16   35,5 W
P4 1600 MHz      6.0       0x06   913 mV   0x10   28,7 W
P5  800 MHz      6.0       0x86   850 mV   0x0B   19,0 W
...


I think it's a misprint. Correct value is "3.0".

Update:
Sorry, my mistake.
This technology (Dynamic FSB Frequency Switching - DFFS) is for reducing energy consumption - the system bus frequency is halved.

This post has been edited by cVad: Oct 4 2009, 01:54 PM
* 10.6.4 * iMac9,1 * E8400(3.6) * Palit GTS250 512Mb * ASUS P5B Deluxe ICH8R JMicron363 AD1988B * home *
jadran
post Oct 4 2009, 10:35 AM
Post #4
QUOTE (shoarthing @ Oct 4 2009, 08:25 AM) *
Thank you. Could you include the Atom family - specifically the useful 330?

Intel's compact *.pdf is here - Section 3.4 on pp. 17-18 seems to contain at least an outline of the information you need, but I would appreciate it if you would check.


I think the Atom 330 does not have SpeedStep.
Or am I wrong?
Superhai the Gre...
post Oct 4 2009, 01:19 PM
Post #5
QUOTE (cVad @ Oct 4 2009, 12:17 PM) *
I think it's a misprint. Correct value is "3.0".


No, it is using multiplier of 6. It achieves lower frequency with Intel's Dynamic FSB feature. 
Superhai the Gre...
post Oct 4 2009, 01:26 PM
Post #6
QUOTE (shoarthing @ Oct 4 2009, 10:25 AM) *
Thank you. Could you include the Atom family - specifically the useful 330?


Atom 330 doesn't support EST. 
youminbuluo
post Nov 29 2009, 07:10 AM
Post #7
QUOTE (Superhai the Great @ Oct 4 2009, 01:26 PM) *
Atom 330 doesn't support EST. 



I would like to know the values for the T5500: Vid0 and the others!
Slice
post Jan 24 2010, 04:34 PM
Post #8
See Intel document 31674504.pdf for the Intel Core 2 Duo mobile processors (T7500, T9300, etc.)
Attached Image

Lower VID -> higher Voltage
Please read the FAQ!
i3-2120 GA-H61M-S1, Radeon HD6670, ALC887(VoodooHDA 2.8.4), OS⌘10.9.1, OS⌘ 10.7.5 Clover FakeSMC_plugins_3.3.1 Realtek LAN v3.1.2
