Discussion:
syncrepl multicast MMR
Howard Chu
2015-02-08 12:52:40 UTC
Been thinking this would be worth trying for a while now. Set a config
option for syncprov to send Persist messages to a multicast group
instead of the original TCP session. All the consumers would also join
the group and listen for updates. This would also exercise the cldap://
support in libldap.

Implementation details: since datagrams are unreliable, we need to
include sequence numbers on each message, which the consumer can check
to make sure it hasn't missed an update. Moreover, it should be able to
send a request to the provider to resend (over the TCP session) the
message corresponding to a given sequence number.

(Currently I envision using a small circular array in the provider to
remember the last N messages for potential retransmit.)
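
A rough sketch of the two pieces described above: a fixed-size circular array on the provider side and a gap check on the consumer side. The class and method names are made up for illustration and are not existing syncprov or syncrepl code; the sequence numbers are assumed to start at 0 and increase by one per message.

```python
class RetransmitRing:
    """Provider side: circular array remembering the last `capacity` messages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity  # each slot holds (seq, payload) or None
        self.next_seq = 0

    def publish(self, payload):
        """Assign the next sequence number and store the message for resend."""
        seq = self.next_seq
        self.slots[seq % self.capacity] = (seq, payload)
        self.next_seq += 1
        return seq

    def lookup(self, seq):
        """Return the payload for `seq`, or None if it was already overwritten."""
        slot = self.slots[seq % self.capacity]
        if slot is not None and slot[0] == seq:
            return slot[1]
        return None


class GapDetector:
    """Consumer side: notice skipped sequence numbers."""

    def __init__(self):
        self.expected = 0

    def receive(self, seq):
        """Return the sequence numbers to request again over the TCP session."""
        if seq < self.expected:
            return []  # duplicate or late datagram, nothing new is missing
        missing = list(range(self.expected, seq))
        self.expected = seq + 1
        return missing
```

If lookup() returns None, the requested message has already fallen out of the array, and the consumer would presumably have to fall back to a normal refresh over TCP.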

Config: both consumer and provider will need to be configured with a
particular multicast group ID. It should be possible to participate in
more than 1 group at a time (in which case, an update must be explicitly
sent to each active group) but in general, I expect a cluster of
cooperating MMR servers to all use a single multicast group, and so any
update will only need to be forwarded to the network once.
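
For illustration, here is roughly what joining one or more groups and sending once per group looks like at the socket level. This assumes the "multicast group ID" maps to an IPv4 multicast address and a UDP port; the function names are hypothetical, and nothing here is existing slapd configuration or code.

```python
import socket
import struct


def membership_request(group, iface="0.0.0.0"):
    """Build the ip_mreq structure used by IP_ADD_MEMBERSHIP (IPv4 only)."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def open_consumer_socket(groups, port):
    """Bind a UDP socket and join every multicast group in `groups`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    for group in groups:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group))
    return sock


def send_update(sock, groups, port, payload):
    """Provider side: an update is sent explicitly to each active group."""
    for group in groups:
        sock.sendto(payload, (group, port))
```

With a single group for the whole cluster, send_update() degenerates to one sendto() call per update.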

Thoughts?
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-02-09 06:06:17 UTC
On 09/02/15 05:15, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 08/02/15 13:52, Howard Chu wrote:
>>> Been thinking this would be worth trying for a while now. Set a config
>>> option for syncprov to send Persist messages to a multicast group
>>> instead of the original TCP session. All the consumers would also join
>>> the group and listen for updates. This would also exercise the
>>> cldap:// support in libldap.
>>>
>>> Implementation details: since datagrams are unreliable, we need to
>>> include sequence numbers on each message, which the consumer can check
>>> to make sure it hasn't missed an update. Moreover, it should be able
>>> to send a request to the provider to resend (over the TCP session) the
>>> message corresponding to a given sequence number.
>>
>> OK, but how do you detect that a consumer has missed an update if no
>> other update occurs? You may have a desynchronized server for quite
>> a long period of time if you don't have a mechanism for the consumer to
>> regularly check whether it is up to date.
>
> Good point, but easily solved with a periodic keepalive msg.

A heartbeat would be good to have: the provider would periodically
multicast the latest CSN, allowing desynchronized servers to catch up.
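
To make the heartbeat idea concrete, here is a small consumer-side sketch. It treats the CSN as an opaque string that sorts in commit order, which is an assumption made for the example; the class name, the interval and the tolerance are all hypothetical.

```python
import time


class HeartbeatMonitor:
    """Consumer side: compare the heartbeat CSN with the last one applied."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval      # expected seconds between heartbeats
        self.clock = clock            # injectable so the logic can be tested
        self.local_csn = ""
        self.last_heard = clock()

    def on_update(self, csn):
        """Record the CSN of an update that was applied locally."""
        self.local_csn = max(self.local_csn, csn)
        self.last_heard = self.clock()

    def on_heartbeat(self, provider_csn):
        """Return True if the consumer is behind and should catch up."""
        self.last_heard = self.clock()
        return provider_csn > self.local_csn

    def provider_silent(self, tolerance=3):
        """True if nothing was heard for `tolerance` heartbeat intervals."""
        return self.clock() - self.last_heard > tolerance * self.interval
```

A consumer that sees on_heartbeat() return True would ask for the missing updates over the TCP session; provider_silent() covers the case where the heartbeats themselves stop arriving.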

Another problem is that datagrams are limited in size, which means big
entries will have to be split into many parts.
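
A minimal sketch of what such splitting could look like, assuming an 8-byte header (message sequence number, fragment index, fragment count) in front of each fragment. The header layout and the names are invented for the example and are not part of any existing protocol.

```python
import struct

# Hypothetical fragment header: message seq, fragment index, fragment count.
HEADER = struct.Struct("!IHH")


def fragment(seq, payload, max_datagram):
    """Split `payload` into datagrams of at most `max_datagram` bytes."""
    chunk = max_datagram - HEADER.size
    if chunk <= 0:
        raise ValueError("max_datagram too small for the header")
    parts = [payload[i:i + chunk] for i in range(0, len(payload), chunk)] or [b""]
    return [HEADER.pack(seq, i, len(parts)) + p for i, p in enumerate(parts)]


class Reassembler:
    """Collect fragments, in any order, until a message is complete."""

    def __init__(self):
        self.partial = {}  # seq -> {fragment index: chunk}

    def add(self, datagram):
        """Return the full payload once all fragments arrived, else None."""
        seq, index, count = HEADER.unpack_from(datagram)
        chunks = self.partial.setdefault(seq, {})
        chunks[index] = datagram[HEADER.size:]
        if len(chunks) < count:
            return None
        del self.partial[seq]
        return b"".join(chunks[i] for i in range(count))
```

Losing a single fragment means the whole entry has to be requested again, so large entries would make the retransmit path more likely to be used.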
Emmanuel Lécharny
2015-02-09 06:22:07 UTC
On 09/02/15 05:15, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 08/02/15 13:52, Howard Chu wrote:
>>> Been thinking this would be worth trying for a while now. Set a config
>>> option for syncprov to send Persist messages to a multicast group
>>> instead of the original TCP session. All the consumers would also join
>>> the group and listen for updates. This would also exercise the
>>> cldap:// support in libldap.
>>>
>>> Implementation details: since datagrams are unreliable, we need to
>>> include sequence numbers on each message, which the consumer can check
>>> to make sure it hasn't missed an update. Moreover, it should be able
>>> to send a request to the provider to resend (over the TCP session) the
>>> message corresponding to a given sequence number.
>>
>> OK, but how do you detect that a consumer has missed an update if no
>> other update occurs? You may have a desynchronized server for quite
>> a long period of time if you don't have a mechanism for the consumer to
>> regularly check whether it is up to date.
>
> Good point, but easily solved with a periodic keepalive msg.

One more thing: you will have to deal with TLS at some point. There is
an Internet-Draft
(https://tools.ietf.org/html/draft-keoh-tls-multicast-security-00) that
proposes something, but it seems to be 3 years old and no longer active.
Andrew Findlay
2015-02-08 19:18:25 UTC
On Sun, Feb 08, 2015 at 12:52:40PM +0000, Howard Chu wrote:
> Been thinking this would be worth trying for a while now. Set a
> config option for syncprov to send Persist messages to a multicast
> group instead of the original TCP session. All the consumers would
> also join the group and listen for updates. This would also exercise
> the cldap:// support in libldap.

It certainly makes sense to have the network do more of the work
where it can. This would be particularly valuable in the high fan-out
case.

Encryption and message signing need some thought - this is usually
harder to get right in datagram protocols than in streams.

While we are talking datagrams and multicast, have you looked at Fountain Codes?
It seems to me that they would be an ideal way to initialise a large set
of replica servers. They could also be used in the persist update case,
avoiding the need for any sort of back-channel.

For those who have not met them, Fountain Codes allow you to broadcast
large datasets to an unknown number of receivers over lossy channels.
If well designed, each receiver needs to collect any randomly-chosen
subset of datagrams adding up to a few percent more bytes than the
source data. One such code is described in RFC 5053, though there
appear to be patent issues to be considered.
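
For readers who want to see the principle, here is a toy LT-style fountain code. It is not the Raptor code of RFC 5053: the degree distribution is simply uniform (an assumption that keeps the example short and makes it far less efficient than a real code), and all names are invented. Each encoded symbol is the XOR of a pseudo-random subset of source blocks, and the subset is derived from a seed sent along with the symbol.

```python
import random


def indices_for(seed, k):
    """Derive the subset of source blocks combined into symbol `seed`."""
    rng = random.Random(seed)
    degree = rng.randint(1, k)  # toy choice: uniform degree distribution
    return set(rng.sample(range(k), degree))


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encode_symbol(blocks, seed):
    """Return (seed, payload) where payload XORs the selected blocks."""
    payload = bytes(len(blocks[0]))
    for i in indices_for(seed, len(blocks)):
        payload = xor(payload, blocks[i])
    return seed, payload


class Decoder:
    """Peeling decoder: resolve symbols that have one unknown block left."""

    def __init__(self, k):
        self.k = k
        self.known = {}    # block index -> recovered bytes
        self.pending = []  # [set of unknown indices, payload] pairs

    def add(self, seed, payload):
        """Feed one received symbol; return True once all blocks are known."""
        self.pending.append([indices_for(seed, self.k), payload])
        progress = True
        while progress:
            progress = False
            for entry in self.pending:
                idx, data = entry
                for i in [i for i in idx if i in self.known]:
                    data = xor(data, self.known[i])  # strip known blocks
                    idx.discard(i)
                entry[1] = data
                if len(idx) == 1:
                    (i,) = idx
                    self.known[i] = data
                    idx.clear()
                    progress = True
            self.pending = [e for e in self.pending if e[0]]
        return len(self.known) == self.k
```

The sender keeps emitting encode_symbol(blocks, seed) for fresh seeds, and a receiver calls add() on whatever symbols happen to arrive until it returns True; which particular symbols were lost does not matter, which is why no back-channel is needed.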

Andrew
--
-----------------------------------------------------------------------
| From Andrew Findlay, Skills 1st Ltd |
| Consultant in large-scale systems, networks, and directory services |
| http://www.skills-1st.co.uk/ +44 1628 782565 |
-----------------------------------------------------------------------
Howard Chu
2015-02-09 15:26:36 UTC
Andrew Findlay wrote:
> On Sun, Feb 08, 2015 at 12:52:40PM +0000, Howard Chu wrote:
>> Been thinking this would be worth trying for a while now. Set a
>> config option for syncprov to send Persist messages to a multicast
>> group instead of the original TCP session. All the consumers would
>> also join the group and listen for updates. This would also exercise
>> the cldap:// support in libldap.
>
> It certainly makes sense to have the network do more of the work
> where it can. This would be particularly valuable in the high fan-out
> case.
>
> Encryption and message signing needs some thought - this is usually
> harder to get right in datagram protocols than in streams.
>
> While we are talking datagrams and multicast, have you looked at Fountain Codes?
> It seems to me that they would be an ideal way to initialise a large set
> of replica servers. They could also be used in the persist update case,
> avoiding the need for any sort of back-channel.

Interesting reading. Seems a bit of overkill to me though; that's
designed for multicast to millions of subscribers where a back-channel
isn't feasible. Syncrepl would never be used with such a high fanout,
and we already have the back-channel anyway, why not keep using it?

> For those who have not met them, Fountain Codes allow you to broadcast
> large datasets to an unknown number of receivers over lossy channels.
> If well designed, each receiver needs to collect any randomly-chosen
> subset of datagrams adding up to a few percent more bytes than the
> source data. One such code is described in RFC5053, though there
> appear to be patent issues to be considered.
>
> Andrew
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/