Re: API change for bus_dma
From: Terry Lambert (tlambert2_at_mindspring.com)
Date: 06/28/03
- Previous message: John-Mark Gurney: "Re: API change for bus_dma"
- In reply to: John-Mark Gurney: "Re: API change for bus_dma"
- Next in thread: Andrew Gallatin: "Re: API change for bus_dma"
- Reply: Andrew Gallatin: "Re: API change for bus_dma"
Date: Sat, 28 Jun 2003 01:52:27 -0700
To: John-Mark Gurney <gurney_j@efn.org>
John-Mark Gurney wrote:
> I'm sorry, no, this will not solve the problem he is talking about.
> You need to reread the information that Andrew has provided before.
> In a previous email you got confused on the STREAMING/COHERENT flag's
> meaning. Using contigmalloc only gives you a linear address space,
> but does not guarantee that the processor will snoop the memory write
> cycles by the bridge or device to keep the cache of the cpu the same
> with the memory. For what Andrew needs, he needs the processor to have
> the same information as in memory. On multiprocessor systems, it can
> get expensive if every processor has to snoop every memory write that
> happens.
Clearly, I don't have a deep understanding of SPARC64 SMP hardware.
Given what he was saying, though, it still looks to me as if the issue
he was attempting to address was whether the memory in question was
physically or only logically contiguous:
<http://docs.freebsd.org/cgi/getmsg.cgi?fetch=153914+0+current/freebsd-arch>
It also still looks to me that the use of "cache coherent" in:
<http://docs.freebsd.org/cgi/getmsg.cgi?fetch=138580+0+current/freebsd-arch>
was referring to user space memory and device memory, and not the
processor cache. Reading the Solaris ddi_dma_sync(9) man page
doesn't change that impression for me: it mentions that explicitly
calling the function may result in cache flushes, but doesn't imply
that snooping will occur.
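To make the distinction between "contiguous" and "coherent" concrete,
here is a toy user-space model in C. It is only a sketch under stated
assumptions: it is not FreeBSD or Solaris kernel code, every name in it
(toy_system, cpu_read, device_dma_write, toy_dma_sync, toy_demo) is
hypothetical, and the "cache" is just a second copy of the buffer. The
buffer is contiguous in both cases; what differs is whether a device
write invalidates the CPU's cached copy (snooping) or whether the
driver has to ask for that explicitly, which is roughly the role a call
like ddi_dma_sync plays.

```c
#include <string.h>

#define BUF_LEN 8

/* Toy model: one contiguous "RAM" buffer plus the CPU's cached copy. */
struct toy_system {
    unsigned char memory[BUF_LEN]; /* contiguous DMA buffer in RAM */
    unsigned char cache[BUF_LEN];  /* CPU's cached copy of the buffer */
    int cache_valid;               /* nonzero once the cache is filled */
    int snooping;                  /* nonzero if hardware keeps cache coherent */
};

static void toy_init(struct toy_system *s, int snooping)
{
    memset(s, 0, sizeof(*s));
    s->snooping = snooping;
}

/* CPU read: fill the cache on a miss, otherwise serve the cached byte. */
static unsigned char cpu_read(struct toy_system *s, int i)
{
    if (!s->cache_valid) {
        memcpy(s->cache, s->memory, BUF_LEN);
        s->cache_valid = 1;
    }
    return s->cache[i];
}

/* Device DMA write: goes straight to RAM. Only a snooping system
 * notices the write and invalidates the CPU's cached copy. */
static void device_dma_write(struct toy_system *s, int i, unsigned char v)
{
    s->memory[i] = v;
    if (s->snooping)
        s->cache_valid = 0;
}

/* Explicit sync: the driver asks for the cached copy to be discarded. */
static void toy_dma_sync(struct toy_system *s)
{
    s->cache_valid = 0;
}

/* Warm the cache, let the device write 0x42, then report what the CPU
 * sees before and after an explicit sync. */
static void toy_demo(int snooping, unsigned char *before, unsigned char *after)
{
    struct toy_system s;

    toy_init(&s, snooping);
    (void)cpu_read(&s, 0);          /* warm the cache with the old value */
    device_dma_write(&s, 0, 0x42);  /* device updates RAM behind the CPU */
    *before = cpu_read(&s, 0);
    toy_dma_sync(&s);
    *after = cpu_read(&s, 0);
}
```

Calling toy_demo(0, &before, &after) from a main() models the
non-snooping case: the CPU keeps seeing the stale byte until the
explicit sync. With toy_demo(1, ...) the device write alone is enough.
Real hardware works per cache line and has write-back concerns this
model ignores.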
There's a good programming article on "Writing Device Drivers" in
the "Sun Product Documentation" online:
<http://docs.sun.com/db/doc/802-5900/6i9kj7oq6?a=view>
It discusses this in detail and seems (to me) to be
authoritative.
I'd be happy to be corrected, but if you're going to correct me,
please tell me *why* I'm wrong, instead of just telling me *that*
I'm wrong, since I really *am* interested in not being wrong for
the same root cause in the future.
Thanks,
-- Terry
_______________________________________________
freebsd-arch@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"