
I'm sorry, I don't have time at the moment to go deeper on this, but I'll add two things that may be useful for the reader:

1. The Intel Optimization Reference, under Section 3.5.1, advises:

"Favor single-micro-operation instructions."

"Avoid using complex instructions (for example, enter, leave, or loop) that have more than 4 micro-ops and require multiple cycles to decode. Use sequences of simple instructions instead."

2. By surface I meant literally silicon surface area. That and transistor count (and other aspects like fan-out and clock domains etc) are the major aspects you trade when engineering a CPU.

If you need larger microcode ROM to store larger microprograms you also need more bits to address into the microcode ROM and that makes microcode programs even larger! All this consumes surface area and transistors that could be devoted to something else.
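To put rough numbers on that, here is a minimal sketch in C. It assumes a purely hypothetical ROM layout in which every microcode word carries a fixed number of control bits plus one next-address field wide enough to index the whole ROM. The names `address_bits`, `rom_total_bits`, and `payload_bits` are made up for illustration; real microcode formats are vendor-specific and largely undocumented.

```c
#include <limits.h>
#include <stddef.h>

/* Smallest number of address bits that can index `entries` ROM words. */
static unsigned address_bits(size_t entries) {
    unsigned bits = 0;
    /* The guard keeps the shift in range for very large inputs. */
    while (bits < sizeof(size_t) * CHAR_BIT - 1 &&
           ((size_t)1 << bits) < entries) {
        bits++;
    }
    return bits;
}

/* Total ROM size in bits under the ASSUMED layout: each word holds
   `payload_bits` of control signals plus one next-address field. */
static size_t rom_total_bits(size_t entries, unsigned payload_bits) {
    return entries * (payload_bits + address_bits(entries));
}
```

Under those assumptions, going from 4096 to 8192 words of 64 control bits each takes the ROM from 4096 * 76 to 8192 * 77 bits, which is slightly more than double, because every word grows by one address bit.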

Sure, if you could fit the binary search instruction microcode in the existing spare space of the microcode ROM you wouldn't have that problem but you'd still be competing with other potential use cases of those alleged "microcode speedups". What about a UTF-8 string length instruction, would that be more important? Etc
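For readers unsure what a "UTF-8 string length" operation would compute, here is a rough software sketch in C built from simple operations. It counts code points by skipping continuation bytes (those of the form 10xxxxxx). The function name `utf8_length` is my own, the input is assumed to be valid, NUL-terminated UTF-8, and this is only an illustration of the task, not a description of any existing instruction.

```c
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated UTF-8 string.
   Every code point has exactly one byte that is NOT a continuation
   byte (10xxxxxx), so counting those bytes gives the length.
   ASSUMPTION: the input is valid UTF-8; no validation is done. */
static size_t utf8_length(const char *s) {
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++) {
        if ((*p & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For example, `utf8_length("h\xc3\xa9" "llo")` (the word "héllo", 6 bytes) gives 5. The loop body is a load, a mask, a compare, and an increment, which is the kind of simple-instruction sequence the Intel advice above favors.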



>"1. The Intel Optimization Reference, under Section 3.5.1, advises:

'Favor single-micro-operation instructions.'

'Avoid using complex instructions (for example, enter, leave, or loop) that have more than 4 micro-ops and require multiple cycles to decode. Use sequences of simple instructions instead.'"

Do you know WHY exactly it says that? Please explain.

>2. By surface I meant literally silicon surface area. That and transistor count (and other aspects like fan-out and clock domains etc) are the major aspects you trade when engineering a CPU.

How are hardware transistor count, fan-out, clock domains, etc. relevant to adding or changing a CPU's microcode instructions via a software microcode patch?

>If you need larger microcode ROM to store larger microprograms you also need more bits to address into the microcode ROM and that makes microcode programs even larger! All this consumes surface area and transistors that could be devoted to something else.

What CPU are we talking about?

>"Sure, if you could fit the binary search instruction microcode in the existing spare space of the microcode ROM"

This is what we are talking about.

>"you wouldn't have that problem"

No, you wouldn't.

>"but you'd still be competing with other potential use cases of those alleged "microcode speedups". What about a UTF-8 string length instruction, would that be more important? Etc"

What are all of the "other potential use cases"?

How would adding an additional microcode instruction to a CPU that had room for it compete with all of the "other potential use cases"?

What exactly do you mean by "UTF-8 string length instruction"?

What exactly do you mean by "string length instruction"?

?



