Modern x86 processors (Athlon, Pentium 4) have split L1 data and instruction caches. What I'm trying to accomplish is this: instructions are fetched from the L1 instruction cache (the code is small enough to fit), but data reads bypass the cache hierarchy and go directly to memory. Even the code currently executing out of the L1 instruction cache should be read directly from memory when a memory-referencing instruction accesses it. (There are no writes, just reads.)
This seems impossible to achieve on the AMD/Intel architecture. I tried several combinations of CR0.CD and MTRR settings, and I can only get one of two extremes: 1) both code and data are served from the cache if found there, or 2) both code and data are read directly from memory (and this includes every instruction executed).
How smart do you have to be to split L1 I&D caches, but not to provide separate control over them?
Tags: amd assembler intel