Discussion:
Interprocess communication on same host
"Ron's Yahoo!" (Redacted sender "zlgonzalez-/E1597aS9LQAvxtiuMwx3w@public.gmane.org" for DMARC)
2014-09-05 02:13:46 UTC
Hi,
If I were to use nanomsg to do interprocess communication on the same host, how would I do that in the most efficient manner?
And are there any benchmarks on the performance difference between interprocess communication on the same host vs. on different hosts? If it's a different host, what is the fastest and most reliable protocol to use?

Thanks,
Ron
Achille Roussel
2014-09-05 02:39:09 UTC
You have 3 transport protocols in nanomsg: inproc, ipc and tcp.

- inproc: communication within a process
- ipc: communication for processes on the same host
- tcp: communication for processes on different hosts

tcp is going to be a bit slower than ipc: it goes through the network stack, so there will be higher latency than if the messages stayed on the same host. But I don't understand how it would be useful to benchmark these two transports against each other, unless you plan on using tcp for host-local communication; in that case it depends on how efficient the named pipe and tcp stacks are on the OS. Really, with how simple nanomsg makes it to use one transport or the other, you should just use the right tool for your use case.
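
For illustration, switching transports is just a matter of changing the address string. A minimal sketch (the pipeline pattern and addresses here are hypothetical, not from the thread):

/* Same code works for inproc://, ipc:// and tcp:// endpoints;
   only the address string changes. */
#include <assert.h>
#include <nanomsg/nn.h>
#include <nanomsg/pipeline.h>

int main (void)
{
    /* Swap in "inproc://example" or "tcp://127.0.0.1:5555" as needed. */
    const char *addr = "ipc:///tmp/example.ipc";

    int pull = nn_socket (AF_SP, NN_PULL);
    int push = nn_socket (AF_SP, NN_PUSH);
    assert (pull >= 0 && push >= 0);

    assert (nn_bind (pull, addr) >= 0);
    assert (nn_connect (push, addr) >= 0);

    assert (nn_send (push, "hello", 5, 0) == 5);

    char buf [8];
    assert (nn_recv (pull, buf, sizeof (buf), 0) == 5);

    nn_close (push);
    nn_close (pull);
    return 0;
}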

Or maybe I’m not understanding your question very well.
Post by "Ron's Yahoo!" (Redacted sender "zlgonzalez-/***@public.gmane.org" for DMARC)
Hi,
If I were to use nanomsg to do interprocess communication on the same host, how would I do that in the most efficient manner?
And have we done some benchmarks on the performance difference of doing interprocess communication between processes on the same host vs different hosts? If it’s a different host, what is the fastest and reliable protocol that can be used?
Thanks,
Ron
"Ron Gonzalez" (Redacted sender "zlgonzalez-/E1597aS9LQAvxtiuMwx3w@public.gmane.org" for DMARC)
2014-09-05 13:56:39 UTC
Thanks Achille. Just curious why we don't use shared memory for IPC so we get the fastest implementation. I guess domain sockets are a lot easier to deal with and don't require locks?

Thanks,
Ron

Sent from my iPhone
Martin Sustrik
2014-09-05 14:21:13 UTC
Post by "Ron Gonzalez" (Redacted sender "zlgonzalez-/***@public.gmane.org" for DMARC)
Thanks Achille. Just curious why we don't use shared memory for IPC
so we get the fastest implementation. I guess domain sockets are a
lot easier to deal with and don't require locks?
Even with shmem you need some way to signal the other process that
there's a message to be received. Whether that's done via an IPC socket
or some other means, it always takes ~6us.

Thus, the shmem only helps when the cost of copying the data is
non-trivial, i.e. for large messages, say 1MB or such.
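
For concreteness, this is the kind of kernel round-trip meant here; an eventfd-based wakeup is shown purely as an illustration (any mechanism pays a similar cost):

/* Illustrative only: however the wakeup is done (IPC socket,
   eventfd, ...), it costs a syscall on each side. The efd would be
   created with eventfd(0, 0) and shared with the peer via fork()
   or SCM_RIGHTS fd passing. */
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

void signal_peer (int efd)
{
    uint64_t one = 1;
    write (efd, &one, sizeof one);   /* syscall: wake the reader */
}

void wait_for_peer (int efd)
{
    uint64_t n;
    read (efd, &n, sizeof n);        /* syscall: block until signaled */
}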

Martin
"Ron's Yahoo!" (Redacted sender "zlgonzalez-/E1597aS9LQAvxtiuMwx3w@public.gmane.org" for DMARC)
2014-09-05 16:18:22 UTC
Thanks Martin.
Wouldn't we always be able to batch messages up so that each send is >1MB, resulting in larger net throughput? This assumes a non-realtime use case of course…

Thanks,
Ron
Martin Sustrik
2014-09-05 18:54:46 UTC
Thanks Martin. Wouldn't we always be able to batch messages up so that
each send is >1MB, resulting in larger net throughput? This assumes a
non-realtime use case of course…
In the current design you can do that at the application level: collect
messages for time T, then send the whole batch as a single message.
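
A rough sketch of what that could look like at the application level (hypothetical framing: a 4-byte length prefix per message, a 1MB flush threshold; a real version would also flush on a timer to bound latency):

#include <stdint.h>
#include <string.h>
#include <nanomsg/nn.h>

#define BATCH_LIMIT (1024 * 1024)

static char batch [BATCH_LIMIT];
static size_t batch_len;

static void batch_flush (int sock)
{
    if (batch_len > 0) {
        nn_send (sock, batch, batch_len, 0);
        batch_len = 0;
    }
}

/* Append one length-prefixed message; ship the batch when full. */
static int batch_send (int sock, const void *buf, uint32_t len)
{
    if (sizeof len + len > BATCH_LIMIT)
        return -1;                   /* too big to batch; send alone */
    if (batch_len + sizeof len + len > BATCH_LIMIT)
        batch_flush (sock);
    memcpy (batch + batch_len, &len, sizeof len);
    batch_len += sizeof len;
    memcpy (batch + batch_len, buf, len);
    batch_len += len;
    return 0;
}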

It doesn't work the other way round though: If nanomsg did extensive
batching it would not be suitable for real-time use cases.

That being said, shmem would still be valuable for transporting very
large messages between processes on the same box. Btw, all the
infrastructure to do that is already implemented; what's missing is
the actual piece of code for managing shmem.

Martin
"Ron Gonzalez" (Redacted sender "zlgonzalez-/E1597aS9LQAvxtiuMwx3w@public.gmane.org" for DMARC)
2014-09-05 21:30:20 UTC
Sorry, but I don't understand. When you say that all the infrastructure
to do that is already implemented, what piece is implemented? Are you
referring to batching up messages?

Would it be valuable to look into contributing shmem management at a
lower level?

Thanks,
Ron
Garrett D'Amore
2014-09-05 23:43:56 UTC
In my experience, people vastly underestimate the performance of bcopy.

Unless you are passing around vast amounts of data (why are you using
something like nanomsg in that case, btw?), it simply doesn't pay off.
You lose the intended performance gains in the extra complexity and locking.

To make this work well (and this would not be portable outside of the
platform barring unusual measures like RDMA), you'd need a collaboration
layer, a very large shared memory region, and some kind of ring or
consumer and producer indices in the buffer. Probably better to have two
separate buffers, one for each direction, with different MMU settings
(cache coherency).

This also becomes really fragile. A bug in one program can now take out
the other, unless you're very careful to treat the shared memory region
with the same kind of care that you do packet data. (i.e. don't pass
around program state, or pointers, etc.) Don't assume that the other side
won't trash the memory.

There may be some extreme cases where this complexity is worthwhile; you
*could* use nanomsg to do that. But again, why? I'd just map the data up,
and use POSIX signaling & mutexes to coordinate access. My guess is that
this will be simpler than trying to coordinate across a simulated network.
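
For reference, the process-shared synchronization that approach relies on looks roughly like this (plain POSIX sketch; the header layout at the start of the shared mapping is hypothetical):

/* Both processes mmap the same region; this header sits at its
   start and must be initialized by exactly one of them. */
#include <stddef.h>
#include <pthread.h>

struct shared_hdr {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    size_t          msg_len;    /* payload follows the header */
};

static int shared_hdr_init (struct shared_hdr *h)
{
    pthread_mutexattr_t ma;
    pthread_condattr_t ca;
    pthread_mutexattr_init (&ma);
    pthread_mutexattr_setpshared (&ma, PTHREAD_PROCESS_SHARED);
    pthread_condattr_init (&ca);
    pthread_condattr_setpshared (&ca, PTHREAD_PROCESS_SHARED);
    if (pthread_mutex_init (&h->lock, &ma) != 0 ||
        pthread_cond_init (&h->ready, &ca) != 0)
        return -1;
    h->msg_len = 0;
    return 0;
}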

I have a hard time imagining that I'd want to forward data received from
this over some kind of device to other parties in the nanomsg
infrastructure, which is why I don't see much call to make this work with
nanomsg.
Martin Sustrik
2014-09-06 12:05:05 UTC
Hi Garrett,
Agreed with all the above.

However, given that there is a use case where one process allocates a
large chunk of memory (say 1GB), does some work on it, then passes it
to another process, etc., I can see no reason why nanomsg should not be
able to support that.

As already mentioned, most of the infrastructure is already in place:
nn_allocmsg() is already allocator-agnostic and so can be used to
allocate a message in shmem (say, for IPC message sizes above 1MB):

void *p = nn_allocmsg (2000000, NN_IPC);

Also, as you may recall, there is a "type" field in the IPC protocol
which can be used to let the other party know that the message is
coming out-of-band, namely in shmem.

All that being said, note that I am not proposing to do ring buffers
etc. Just allocate very large messages as chunks of shmem and you are
done. (Ring buffers are also doable, but hardly worth it IMO.)

Ron, if you are interested in this stuff, feel free to implement it.
What you have to look at is how nn_allocmsg/nn_freemsg works, add
allocation of messages in shmem there, then modify the IPC transport
in such a way that it can transport shmem descriptors in addition to
the standard IPC bytestream.
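
For illustration, the allocation side might look roughly like this in plain POSIX shmem (not actual nanomsg internals; the object name would travel over the IPC bytestream so the peer can shm_open and mmap the same region):

/* Hypothetical sketch; link with -lrt on older Linux systems. */
#include <stddef.h>
#include <sys/types.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

static void *shmem_alloc (const char *name, size_t size)
{
    int fd = shm_open (name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate (fd, (off_t) size) < 0) {
        close (fd);
        shm_unlink (name);
        return NULL;
    }
    void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
    close (fd);                  /* the mapping keeps the object alive */
    return p == MAP_FAILED ? NULL : p;
}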

Martin
"Ron's Yahoo!" (Redacted sender "zlgonzalez-/E1597aS9LQAvxtiuMwx3w@public.gmane.org" for DMARC)
2014-09-06 19:57:29 UTC
Thanks Martin and Garrett,
Will take a closer look at the code and see what I can do...

Thanks,
Ron
Garrett D'Amore
2014-09-06 22:11:04 UTC
Without some kind of collaboration layer so that parties understand which addresses are in use and how, it seems kind of useless to me. All the hard work still lives in the application. This compares poorly to other nanomsg transports.

If you want to map up a bunch of data, use nanomsg to send control data only, and use shmem for the data, then just do that. You don't have to do anything more than you would if you were going to try to have nanomsg handle this for you.
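
A sketch of that hybrid (the control struct is hypothetical; the shmem object itself would be created and mapped separately, as in the earlier example):

/* nanomsg carries only this small descriptor; the bulk data lives
   in the named shmem object that both sides map themselves. */
#include <stdint.h>
#include <nanomsg/nn.h>

struct shm_ctrl {
    char     name [64];   /* shm_open() name, e.g. "/blob-1234" */
    uint64_t size;        /* number of bytes mapped at that name */
};

/* Publish the descriptor over an already-connected nn socket. */
static int send_ctrl (int sock, const struct shm_ctrl *c)
{
    return nn_send (sock, c, sizeof *c, 0);
}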

As always, KISS.

A big part of nanomsg's attraction is that it makes it simple and easy to build fault-tolerant distributed architectures. Using shared memory flies in the face of that.

Sent from my iPhone
Alex Elsayed
2014-09-07 18:48:19 UTC
On future versions of Linux there's also the option of using memfds,
which went in for 3.17. Memfds allow you to allocate an in-memory
anonymous file, write to it, and 'seal' it to lock it against changes.
The intent is explicitly that these be used for IPC, via the FD-passing
mechanism.

In doing testing to find the performance turnover point relative to
passing byte arrays through kdbus (which does exactly two context
switches on a one-way message), the developers found that the value
where memfds outperformed straight copy in 1-to-1 communication was
512KB, and surprisingly was the same across platforms: x86, ARM,
x86_64, etc.

See https://dvdhrm.wordpress.com/tag/memfd/ for more info.
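
The basic pattern looks roughly like this (sketch assuming Linux >= 3.17; older glibc lacks the memfd_create wrapper, in which case syscall(SYS_memfd_create, ...) is needed; passing the fd over an AF_UNIX socket via SCM_RIGHTS is not shown):

#define _GNU_SOURCE           /* for memfd_create / F_ADD_SEALS / F_SEAL_* */
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create an anonymous in-memory file, fill it, and seal it so the
   receiver can mmap it knowing the contents can no longer change. */
int make_sealed_blob (const void *data, size_t len)
{
    int fd = memfd_create ("blob", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;
    if (write (fd, data, len) != (ssize_t) len) {
        close (fd);
        return -1;
    }
    if (fcntl (fd, F_ADD_SEALS,
               F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        close (fd);
        return -1;
    }
    return fd;
}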
Garrett D'Amore
2014-09-08 03:04:48 UTC
I'm not surprised. As I said. Copying is fast.

Sent from my iPhone
Alex Elsayed
2014-09-08 05:26:40 UTC
I'm not surprised by the number being high; I'm surprised by it being
_consistent_. Usually you _don't_ see that kind of similarity across
embedded ARM boards, normal desktops and laptops, and beefy servers.

I would have expected, for instance, that the turnover point might be lower
on embedded because of generally poorer memory bandwidth. Or that it might
be higher on multi-CPU servers because NUMA overhead for nonlocal access
might dominate in reading from a memfd.