Wednesday, November 17, 2004

Using x86info to Learn About HTT

Back in July I described my experiences running a June snapshot of FreeBSD 5-CURRENT on a Dell PowerEdge 750 1U server. Recently I installed FreeBSD 5.3-RELEASE on the same hardware; here is the dmesg output for a kernel recompiled with SMP support.

While perusing the freebsd-current mailing list I came upon this thread, which in part debated the merits of recompiling for SMP support on machines with Hyper-Threading Technology (HTT). According to Intel:

"Hyper-Threading Technology, available on Intel Pentium 4 processors supporting Hyper-Threading Technology, is a form of simultaneous multithreading (SMT) that makes a single processor look like multiple processors to the operating system.

In its current implementation, HT Technology delivers two logical processors that can execute different tasks simultaneously using shared hardware resources. While an HT Technology–enabled processor doesn't equal the computing power of two physical processors, performance can be boosted with very little increase in cost and power consumption. For certain computing workloads, HT Technology improves performance with a much lower overhead of cost, power consumption, and space than the overhead required by multiple physical processors."

The poster in the freebsd-current thread needed a way to see how FreeBSD saw his HTT system. In other words, did it have "one" CPU or "two"? One reply suggested trying x86info, found in the ports tree.

Here is an example of how a true single CPU system appears to x86info:

bourque:/root# x86info
x86info v1.12b. Dave Jones 2001-2003
Feedback to .

Found 1 CPU
--------------------------------------------------------------------------
Family: 6 Model: 6 Stepping: 5 Type: 0 Brand: 0
CPU Model: Celeron (Mendocino) Original OEM
Instruction TLB: 4KB pages, 4-way associative, 32 entries
Instruction TLB: 4MB pages, fully associative, 2 entries
Data TLB: 4KB pages, 4-way associative, 64 entries
L2 unified cache:
Size: 128KB 4-way associative.
line size=32 bytes.
L1 Instruction cache:
Size: 16KB 4-way associative.
line size=32 bytes.
Data TLB: 4MB pages, 4-way associative, 8 entries
L1 Data cache:
Size: 16KB 4-way associative.
line size=32 bytes.

Here's how a true dual CPU system appears:

janney:/root# x86info
x86info v1.12b. Dave Jones 2001-2003
Feedback to .

Found 2 CPUs
--------------------------------------------------------------------------
CPU #1
/dev/cpu/0/cpuid: No such file or directory
Family: 6 Model: 7 Stepping: 3 Type: 0 Brand: 0
CPU Model: Pentium III (Katmai) [kC0] Original OEM
Instruction TLB: 4KB pages, 4-way associative, 32 entries
Instruction TLB: 4MB pages, fully associative, 2 entries
Data TLB: 4KB pages, 4-way associative, 64 entries
L2 unified cache:
Size: 512KB 4-way associative.
line size=32 bytes.
L1 Instruction cache:
Size: 16KB 4-way associative.
line size=32 bytes.
Data TLB: 4MB pages, 4-way associative, 8 entries
L1 Data cache:
Size: 16KB 4-way associative.
line size=32 bytes.
--------------------------------------------------------------------------
CPU #2
Family: 6 Model: 7 Stepping: 3 Type: 0 Brand: 0
CPU Model: Pentium III (Katmai) [kC0] Original OEM
Instruction TLB: 4KB pages, 4-way associative, 32 entries
Instruction TLB: 4MB pages, fully associative, 2 entries
Data TLB: 4KB pages, 4-way associative, 64 entries
L2 unified cache:
Size: 512KB 4-way associative.
line size=32 bytes.
L1 Instruction cache:
Size: 16KB 4-way associative.
line size=32 bytes.
Data TLB: 4MB pages, 4-way associative, 8 entries
L1 Data cache:
Size: 16KB 4-way associative.
line size=32 bytes.
--------------------------------------------------------------------------
WARNING: Detected SMP, but unable to access cpuid driver.
Used Uniprocessor CPU routines. Results inaccurate.

Here's how an HTT-enabled, single CPU system with a kernel recompiled for SMP appears:

fedorov:/root# x86info
x86info v1.12b. Dave Jones 2001-2003
Feedback to .

Found 2 CPUs
--------------------------------------------------------------------------
CPU #1
/dev/cpu/0/cpuid: No such file or directory
unknown TLB/cache descriptor: 0x60
Family: 15 Model: 3 Stepping: 4 Type: 0 Brand: 0
CPU Model: Unknown CPU Original OEM
Processor name string: Intel(R) Pentium(R) 4 CPU 2.80GHz

Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 64 entries.
Data TLB: 4KB or 4MB pages, fully associative, 64 entries.
unknown TLB/cache descriptor: 0x60
No level 2 cache or no level 3 cache if valid 2nd level cache.
Instruction trace cache:
Size: 12K uOps 8-way associative.
L2 unified cache:
Size: 1MB Sectored, 8-way associative.
line size=64 bytes.
Processor serial: 0000-0F34-0000-0000-0000-0000
Number of logical processors supported within the physical package: 0

--------------------------------------------------------------------------
CPU #2
unknown TLB/cache descriptor: 0x60
Family: 15 Model: 3 Stepping: 4 Type: 0 Brand: 0
CPU Model: Unknown CPU Original OEM
Processor name string: Intel(R) Pentium(R) 4 CPU 2.80GHz

Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 64 entries.
Data TLB: 4KB or 4MB pages, fully associative, 64 entries.
unknown TLB/cache descriptor: 0x60
No level 2 cache or no level 3 cache if valid 2nd level cache.
Instruction trace cache:
Size: 12K uOps 8-way associative.
L2 unified cache:
Size: 1MB Sectored, 8-way associative.
line size=64 bytes.
Processor serial: 0000-0F34-0000-0000-0000-0000
Number of logical processors supported within the physical package: 0

--------------------------------------------------------------------------
WARNING: Detected SMP, but unable to access cpuid driver.
Used Uniprocessor CPU routines. Results inaccurate.

Here's how the same hardware, with the default GENERIC kernel and no SMP, appears:

forsberg# x86info
x86info v1.12b. Dave Jones 2001-2003
Feedback to .

Found 1 CPU
--------------------------------------------------------------------------
unknown TLB/cache descriptor: 0x60
Family: 15 Model: 3 Stepping: 4 Type: 0 Brand: 0
CPU Model: Unknown CPU Original OEM
Processor name string: Intel(R) Pentium(R) 4 CPU 2.80GHz

Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 64 entries.
Data TLB: 4KB or 4MB pages, fully associative, 64 entries.
unknown TLB/cache descriptor: 0x60
No level 2 cache or no level 3 cache if valid 2nd level cache.
Instruction trace cache:
Size: 12K uOps 8-way associative.
L2 unified cache:
Size: 1MB Sectored, 8-way associative.
line size=64 bytes.
Processor serial: 0000-0F34-0000-0000-0000-0000
Number of logical processors supported within the physical package: 0
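
In all four listings the "Found N CPU(s)" line is the quickest thing to compare. Here is a minimal Python sketch that pulls that number out of saved x86info output. It assumes the v1.12b output format shown above; the function name count_cpus is my own invention and is not part of x86info.

```python
import re

def count_cpus(x86info_output):
    """Return the number from the 'Found N CPU(s)' line, or None if absent."""
    match = re.search(r"^Found (\d+) CPUs?\s*$", x86info_output, re.MULTILINE)
    return int(match.group(1)) if match else None

# Header lines copied from the listings above.
single_cpu = "x86info v1.12b. Dave Jones 2001-2003\n\nFound 1 CPU\n"
htt_with_smp = "x86info v1.12b. Dave Jones 2001-2003\n\nFound 2 CPUs\n"

print(count_cpus(single_cpu))    # 1
print(count_cpus(htt_with_smp))  # 2
```

Note that the count reflects what the running kernel exposes, not the number of physical packages: the same Pentium 4 reports 1 under GENERIC and 2 under the SMP kernel.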

There are two simple ways to tell whether your FreeBSD system is SMP-enabled. First, check top output. On a single CPU box the process table has no "C" column:

last pid: 30148; load averages: 0.00, 0.02, 0.00 up 9+04:05:35 22:05:56
22 processes: 1 running, 21 sleeping
CPU states: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle
Mem: 6304K Active, 127M Inact, 63M Wired, 60M Buf, 298M Free
Swap: 1024M Total, 1024M Free

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
405 root 96 0 3440K 2792K select 0:05 0.00% 0.00% sendmail
421 root 8 0 1356K 1044K nanslp 0:01 0.00% 0.00% cron
288 root 96 0 1312K 904K select 0:01 0.00% 0.00% syslogd

Compare that listing with the following on an HTT box with SMP enabled. The "C" column shows the CPU for each process. At the time top was running, sendmail and cron used "CPU one" and syslogd used "CPU zero":

last pid: 49096; load averages: 0.00, 0.00, 0.00 up 9+05:33:40 22:06:50
31 processes: 1 running, 30 sleeping
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 10M Active, 152M Inact, 71M Wired, 16K Cache, 60M Buf, 261M Free
Swap: 1024M Total, 1024M Free

PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
410 root 96 0 3460K 2800K select 1 0:08 0.00% 0.00% sendmail
426 root 8 0 1356K 1044K nanslp 1 0:02 0.00% 0.00% cron
291 root 96 0 1312K 868K select 0 0:01 0.00% 0.00% syslogd
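
The same check can be scripted against saved top output. This is only a sketch, assuming the FreeBSD 5.x top layout shown in the two listings above; the function name last_cpu_by_command is hypothetical.

```python
def last_cpu_by_command(top_output):
    """Map each command to the value in top's 'C' column.

    Returns an empty dict when the header has no 'C' column,
    which is what a non-SMP kernel shows.
    """
    cpu_index = None
    result = {}
    for line in top_output.splitlines():
        fields = line.split()
        if fields[:2] == ["PID", "USERNAME"]:
            # Header row: look for a field that is exactly "C" (not "CPU").
            if "C" not in fields:
                return {}
            cpu_index = fields.index("C")
        elif cpu_index is not None and len(fields) > cpu_index and fields[0].isdigit():
            # Process row: the command is the last field.
            result[fields[-1]] = int(fields[cpu_index])
    return result

# Process tables copied from the two top listings above.
uniprocessor = """PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
405 root 96 0 3440K 2792K select 0:05 0.00% 0.00% sendmail
"""
htt_smp = """PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
410 root 96 0 3460K 2800K select 1 0:08 0.00% 0.00% sendmail
426 root 8 0 1356K 1044K nanslp 1 0:02 0.00% 0.00% cron
291 root 96 0 1312K 868K select 0 0:01 0.00% 0.00% syslogd
"""

print(last_cpu_by_command(uniprocessor))  # {}
print(last_cpu_by_command(htt_smp))       # {'sendmail': 1, 'cron': 1, 'syslogd': 0}
```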

Second, check the dmesg output. On that same HTT SMP system, it shows a "second CPU" launching:

SMP: AP CPU #1 Launched!
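
If you save dmesg output to a file, a few lines of Python can list the launched processors. This sketch assumes the message format shown above; "AP" stands for application processor, meaning any CPU other than the one the system booted on. The function name launched_aps is hypothetical.

```python
import re

def launched_aps(dmesg_text):
    """Return the CPU numbers from 'SMP: AP CPU #N Launched!' lines."""
    pattern = r"^SMP: AP CPU #(\d+) Launched!"
    return [int(n) for n in re.findall(pattern, dmesg_text, re.MULTILINE)]

# The line from the HTT SMP system above.
smp_dmesg = "SMP: AP CPU #1 Launched!\n"

print(launched_aps(smp_dmesg))  # [1]
print(launched_aps(""))         # []
```

An empty list suggests the kernel brought up no additional CPUs, which is what I would expect from the GENERIC kernel without SMP.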

I intend to run my HTT systems with SMP enabled to see how they perform.
