Discussion: syncrepl consumer is slow
Howard Chu
2015-01-29 03:12:17 UTC
Permalink

One thing I just noticed, while testing replication with 3 servers on my
laptop - during a refresh, the provider gets blocked waiting to write to
the consumers after writing about 4000 entries. I.e., the consumers
aren't processing fast enough to keep up with the search running on the
provider.

(That's actually not too surprising since reads are usually faster than
writes anyway.)

The consumer code has lots of problems as it is, just adding this note
to the pile.

I'm considering adding an option to the consumer to write its entries
with dbnosync during the refresh phase. The rationale being, there's
nothing to lose anyway if the refresh is interrupted. I.e., the consumer
can't update its contextCSN until the very end of the refresh, so any
partial refresh that gets interrupted is wasted effort - the consumer
will always have to start over from the beginning on its next refresh
attempt. As such, there's no point in safely/synchronously writing any
of the received entries - they're useless until the final contextCSN update.

The implementation approach would be to define a new control e.g. "fast
write" for the consumer to pass to the underlying backend on any write
op. We would also have to e.g. add an MDB_TXN_NOSYNC flag to
mdb_txn_begin() (BDB already has the equivalent flag).
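
To sketch the idea (illustrative only - MDB_TXN_NOSYNC is the proposed
addition, not something LMDB has today, and the control plumbing is
omitted):

    /* Sketch: MDB_TXN_NOSYNC is the proposed per-txn flag; stock LMDB
     * only has MDB_NOSYNC as an environment-wide flag set at
     * mdb_env_open() time. */
    #include <lmdb.h>

    int consumer_store(MDB_env *env, MDB_dbi dbi,
                       MDB_val *key, MDB_val *data, int refreshing)
    {
        MDB_txn *txn;
        int rc;

        /* During refresh, skip the fsync on commit: if we crash, the
         * whole refresh restarts anyway, since contextCSN is only
         * written at the very end. */
        rc = mdb_txn_begin(env, NULL, refreshing ? MDB_TXN_NOSYNC : 0, &txn);
        if (rc)
            return rc;
        rc = mdb_put(txn, dbi, key, data, 0);
        if (rc) {
            mdb_txn_abort(txn);
            return rc;
        }
        return mdb_txn_commit(txn);
    }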

This would only be used for writes that are part of a refresh phase. In
persist mode the provider and consumers' write speeds should be more
closely matched so it wouldn't be necessary or useful.

Comments?
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Quanah Gibson-Mount
2015-01-29 04:20:55 UTC
Permalink
Post by Howard Chu
This would only be used for writes that are part of a refresh phase. In
persist mode the provider and consumers' write speeds should be more
closely matched so it wouldn't be necessary or useful.
I've had a few cases on extremely busy systems with multiple replicas/mmr
nodes where they literally never catch up. Only way I've been able to
resolve those cases is to stop them, slapcat the master, slapadd, and
restart. Hopefully this change would alleviate that scenario.

--Quanah
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Howard Chu
2015-01-29 04:34:28 UTC
Permalink
Post by Quanah Gibson-Mount
Post by Howard Chu
This would only be used for writes that are part of a refresh phase. In
persist mode the provider and consumers' write speeds should be more
closely matched so it wouldn't be necessary or useful.
I've had a few cases on extremely busy systems with multiple
replicas/mmr nodes where they literally never catch up. Only way I've
been able to resolve those cases is to stop them, slapcat the master,
slapadd, and restart. Hopefully this change would alleviate that scenario.
Yes, I'm seeing the same thing. And yes, that's my hope as well. Not
sure if it's enough; like I said there are other performance issues in
the consumer code.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-01-29 07:10:44 UTC
Permalink
Post by Howard Chu
One thing I just noticed, while testing replication with 3 servers on
my laptop - during a refresh, the provider gets blocked waiting to
write to the consumers after writing about 4000 entries. I.e., the
consumers aren't processing fast enough to keep up with the search
running on the provider.
(That's actually not too surprising since reads are usually faster
than writes anyway.)
The consumer code has lots of problems as it is, just adding this note
to the pile.
I'm considering adding an option to the consumer to write its entries
with dbnosync during the refresh phase. The rationale being, there's
nothing to lose anyway if the refresh is interrupted. I.e., the
consumer can't update its contextCSN until the very end of the
refresh, so any partial refresh that gets interrupted is wasted effort
- the consumer will always have to start over from the beginning on
its next refresh attempt. As such, there's no point in
safely/synchronously writing any of the received entries - they're
useless until the final contextCSN update.
The implementation approach would be to define a new control e.g.
"fast write" for the consumer to pass to the underlying backend on any
write op. We would also have to e.g. add an MDB_TXN_NOSYNC flag to
mdb_txn_begin() (BDB already has the equivalent flag).
This would only be used for writes that are part of a refresh phase.
In persist mode the provider and consumers' write speeds should be
more closely matched so it wouldn't be necessary or useful.
Comments?
The proposal sounds sane.

Speaking of which, we had a discussion about some other features that
would be nice to have: when a consumer reconnects to a provider, the
consumer has no idea how many entries it will receive. It would be
valuable to pass an extra piece of information in the exchanged cookie,
namely the number of updated entries. That could provide a hint for
users or admins who would like to know how long the update will take on
a consumer (assuming we log such information). Also, batching the
updates in the backend, i.e. grouping the updates before syncing them,
could be interesting to have, still associated with some logs, again
allowing the admin/user to follow the update progression.

Something like:

syncrepl : 1240 entries to update
syncrepl : 200/1240 entries updated
syncrepl : 400/1240 entries updated
...
syncrepl : server up to date.
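
As an illustration, the consumer-side logging for that could be as
small as this (the expected count coming from the hypothetical cookie
field described above; not existing code):

    #include <stdio.h>

    /* Sketch: progress logging, assuming the sync cookie carried a
     * hypothetical expected-update count. */
    void log_refresh_progress(int received, int expected)
    {
        if (expected > 0 && received % 200 == 0)
            fprintf(stderr, "syncrepl : %d/%d entries updated\n",
                    received, expected);
    }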
Hallvard Breien Furuseth
2015-01-30 09:21:40 UTC
Permalink
Post by Howard Chu
I'm considering adding an option to the consumer to write its entries with
dbnosync during the refresh phase. The rationale being, there's nothing to
lose anyway if the refresh is interrupted. I.e., the consumer can't update
its contextCSN until the very end of the refresh, so any partial refresh that
gets interrupted is wasted effort - the consumer will always have to start
over from the beginning on its next refresh attempt.
dbnosync loses consistency after a system crash, and it loses the knowledge
that the DB may be inconsistent. At least with back-mdb. The safe thing
to do after such a crash is to throw away the DB and fetch the entire thing
from the provider. Which I gather would need to happen automatically
with such an option.
--
Hallvard
Michael Ströder
2015-01-30 09:30:31 UTC
Permalink
Post by Hallvard Breien Furuseth
Post by Howard Chu
I'm considering adding an option to the consumer to write its entries with
dbnosync during the refresh phase. The rationale being, there's nothing to
lose anyway if the refresh is interrupted. I.e., the consumer can't update
its contextCSN until the very end of the refresh, so any partial refresh that
gets interrupted is wasted effort - the consumer will always have to start
over from the beginning on its next refresh attempt.
dbnosync loses consistency after a system crash, and it loses the knowledge
that the DB may be inconsistent. At least with back-mdb. The safe thing
to do after such a crash is to throw away the DB and fetch the entire thing
from the provider. Which I gather would need to happen automatically
with such an option.
From my purely operational standpoint:

The consumer does not have a valid contextCSN before being fully synced. This
must be ensured. Everything else can be handled separately. In a serious
deployment the monitoring will have the red light on for this replica, and a
decent health check in the load balancers will disable using this replica.

=> don't over-engineer too many things to happen automagically, especially if
you're not 100% sure that this auto-magic is rock-solid on every supported OS
platform and in every exotic operational situation.

Ciao, Michael.
Howard Chu
2015-02-03 04:11:50 UTC
Permalink
Post by Hallvard Breien Furuseth
Post by Howard Chu
I'm considering adding an option to the consumer to write its entries with
dbnosync during the refresh phase. The rationale being, there's nothing to
lose anyway if the refresh is interrupted. I.e., the consumer can't update
its contextCSN until the very end of the refresh, so any partial refresh that
gets interrupted is wasted effort - the consumer will always have to start
over from the beginning on its next refresh attempt.
dbnosync loses consistency after a system crash, and it loses the knowledge
that the DB may be inconsistent. At least with back-mdb. The safe thing
to do after such a crash is to throw away the DB and fetch the entire thing
from the provider. Which I gather would need to happen automatically
with such an option.
Another option here is simply to perform batching. Now that we have the
TXN API exposed in the backend interface, we could just batch up e.g.
500 entries per txn, much like slapadd -q already does. Ultimately we
ought to be able to get syncrepl refresh to occur at nearly the same
speed as slapadd -q.
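
For illustration, a minimal sketch of that batching pattern with plain
LMDB calls (the real change would go through the backend TXN interface;
the names here are made up):

    /* Sketch: accumulate BATCH entries per write transaction, the way
     * slapadd -q does, instead of one txn per entry. */
    #include <lmdb.h>

    #define BATCH 500

    static MDB_txn *batch_txn;
    static int batch_count;

    int refresh_store(MDB_env *env, MDB_dbi dbi, MDB_val *key, MDB_val *data)
    {
        int rc;

        if (batch_txn == NULL) {
            rc = mdb_txn_begin(env, NULL, 0, &batch_txn);
            if (rc)
                return rc;
        }
        rc = mdb_put(batch_txn, dbi, key, data, 0);
        if (rc) {
            mdb_txn_abort(batch_txn);
            batch_txn = NULL;
            batch_count = 0;
            return rc;
        }
        if (++batch_count == BATCH) {
            rc = mdb_txn_commit(batch_txn);
            batch_txn = NULL;
            batch_count = 0;
        }
        return rc;
    }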
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-02-03 05:13:44 UTC
Permalink
Post by Howard Chu
Post by Hallvard Breien Furuseth
Post by Howard Chu
I'm considering adding an option to the consumer to write its entries with
dbnosync during the refresh phase. The rationale being, there's nothing to
lose anyway if the refresh is interrupted. I.e., the consumer can't update
its contextCSN until the very end of the refresh, so any partial refresh that
gets interrupted is wasted effort - the consumer will always have to start
over from the beginning on its next refresh attempt.
dbnosync loses consistency after a system crash, and it loses the knowledge
that the DB may be inconsistent. At least with back-mdb. The safe thing
to do after such a crash is to throw away the DB and fetch the entire thing
from the provider. Which I gather would need to happen automatically
with such an option.
Another option here is simply to perform batching. Now that we have
the TXN api exposed in the backend interface, we could just batch up
e.g. 500 entries per txn. much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.

This is where it would be cool to extend the cookie to carry the
expected number of updates you are going to receive (which will
obviously be 1 in a normal running refresh-and-persist replication, but
> 1 most of the time when reconnecting). In this case, you can
anticipate the batching operation without having to take care of the
time issue.

My 2 cts.
Howard Chu
2015-02-03 08:41:30 UTC
Permalink
Post by Emmanuel Lécharny
Post by Howard Chu
Another option here is simply to perform batching. Now that we have
the TXN API exposed in the backend interface, we could just batch up
e.g. 500 entries per txn, much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.
This isn't a problem - we know exactly when refresh completes, so we can
finish the batch regardless of how many entries are left over.
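
Continuing the illustrative batching sketch from earlier in the thread,
the end-of-refresh handling is just this (refreshDone being the
syncInfo message that ends the refresh):

    /* Sketch: on the refreshDone indication, commit whatever is left
     * in the open batch, however few entries it holds. */
    int refresh_finish(void)
    {
        int rc = 0;

        if (batch_txn) {
            rc = mdb_txn_commit(batch_txn);
            batch_txn = NULL;
            batch_count = 0;
        }
        /* ... then write the final contextCSN ... */
        return rc;
    }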

Testing this out with the experimental ITS#8040 patch - with lazy commit
the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh to pull
them across. With batching 500 entries/txn+lazy commit it takes ~7
minutes, a decent improvement. It's still 2x slower than slapadd -q
though, which loads the data in 3-1/2 minutes.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu
2015-02-03 08:54:26 UTC
Permalink
Post by Howard Chu
Post by Emmanuel Lécharny
Post by Howard Chu
Another option here is simply to perform batching. Now that we have
the TXN API exposed in the backend interface, we could just batch up
e.g. 500 entries per txn, much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.
This isn't a problem - we know exactly when refresh completes, so we can
finish the batch regardless of how many entries are left over.
Testing this out with the experimental ITS#8040 patch - with lazy commit
the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh to pull
them across. With batching 500 entries/txn+lazy commit it takes ~7
minutes, a decent improvement. It's still 2x slower than slapadd -q
though, which loads the data in 3-1/2 minutes.
In case anyone else wants to try this out, patch attached.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-02-03 09:42:19 UTC
Permalink
Post by Howard Chu
Post by Emmanuel Lécharny
Post by Howard Chu
Another option here is simply to perform batching. Now that we have
the TXN API exposed in the backend interface, we could just batch up
e.g. 500 entries per txn, much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.
This isn't a problem - we know exactly when refresh completes, so we
can finish the batch regardless of how many entries are left over.
True for Refresh. I was thinking more specifically of updates when we
are connected.

The idea of pushing the expected number of updates within the cookie is
for information purposes: having this number traced in the logs and
monitored could help in cases where the refresh phase takes a long
time: the users will not stop the server thinking it has stalled.
Post by Howard Chu
Testing this out with the experimental ITS#8040 patch - with lazy
commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
to pull them across. With batching 500 entries/txn+lazy commit it
takes ~7 minutes, a decent improvement. It's still 2x slower than
slapadd -q though, which loads the data in 3-1/2 minutes.
Not bad at all. What makes it 2x slower, btw?
Howard Chu
2015-02-03 09:54:39 UTC
Permalink
Post by Emmanuel Lécharny
Post by Howard Chu
Post by Howard Chu
Another option here is simply to perform batching. Now that we have
the TXN API exposed in the backend interface, we could just batch up
e.g. 500 entries per txn, much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.
This isn't a problem - we know exactly when refresh completes, so we
can finish the batch regardless of how many entries are left over.
True for Refresh. I was thinking more specifically of updates when we
are connected.
None of this is for Persist phase, I have only been talking about refresh.
Post by Emmanuel Lécharny
Post by Howard Chu
Testing this out with the experimental ITS#8040 patch - with lazy
commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
to pull them across. With batching 500 entries/txn+lazy commit it
takes ~7 minutes, a decent improvement. It's still 2x slower than
slapadd -q though, which loads the data in 3-1/2 minutes.
Not bad at all. What makes it 2x slower, btw?
Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
and one to write to the DB. syncrepl consumer only uses 1 thread.
Probably if we split reading from the network apart from writing to the
DB, that would make the difference.
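
A minimal sketch of that split - one thread reads and decodes entries
from the network and enqueues them, another dequeues and writes to the
DB. All the names and the entry type are invented for illustration;
this is not actual slapd code:

    #include <pthread.h>

    #define QSIZE 1024

    typedef struct entry_msg { void *entry; } entry_msg;

    static entry_msg queue[QSIZE];
    static int qhead, qtail, qcount, done;
    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t notempty = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t notfull = PTHREAD_COND_INITIALIZER;

    /* network thread: hand a decoded entry over to the writer */
    void enqueue(entry_msg m)
    {
        pthread_mutex_lock(&qlock);
        while (qcount == QSIZE)
            pthread_cond_wait(&notfull, &qlock);
        queue[qtail] = m;
        qtail = (qtail + 1) % QSIZE;
        qcount++;
        pthread_cond_signal(&notempty);
        pthread_mutex_unlock(&qlock);
    }

    /* network thread: signal end of refresh */
    void enqueue_done(void)
    {
        pthread_mutex_lock(&qlock);
        done = 1;
        pthread_cond_broadcast(&notempty);
        pthread_mutex_unlock(&qlock);
    }

    /* writer thread: drain the queue, writing to the DB */
    void *db_writer(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&qlock);
            while (qcount == 0 && !done)
                pthread_cond_wait(&notempty, &qlock);
            if (qcount == 0) {      /* done and drained */
                pthread_mutex_unlock(&qlock);
                break;
            }
            entry_msg m = queue[qhead];
            qhead = (qhead + 1) % QSIZE;
            qcount--;
            pthread_cond_signal(&notfull);
            pthread_mutex_unlock(&qlock);
            /* ... store m.entry here, e.g. with the batched write
             * sketched earlier ... */
        }
        return NULL;
    }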
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-02-03 13:16:34 UTC
Permalink
Post by Howard Chu
Post by Emmanuel Lécharny
Post by Emmanuel Lécharny
Post by Howard Chu
Another option here is simply to perform batching. Now that we have
the TXN API exposed in the backend interface, we could just batch up
e.g. 500 entries per txn, much like slapadd -q already does.
Ultimately we ought to be able to get syncrepl refresh to occur at
nearly the same speed as slapadd -q.
Batching is OK, except that you never know how many entries you're going
to have, thus you will have to actually write the data after a period of
time, even if you don't have the 500 entries.
This isn't a problem - we know exactly when refresh completes, so we
can finish the batch regardless of how many entries are left over.
True for Refresh. I was thinking more specifically of updates when we
are connected.
None of this is for Persist phase, I have only been talking about refresh.
Thanks for the clarification.
Post by Howard Chu
Post by Emmanuel Lécharny
Testing this out with the experimental ITS#8040 patch - with lazy
commit the 2.8M entries (2.5GB data) takes ~10 minutes for the refresh
to pull them across. With batching 500 entries/txn+lazy commit it
takes ~7 minutes, a decent improvement. It's still 2x slower than
slapadd -q though, which loads the data in 3-1/2 minutes.
Not bad at all. What makes it 2x slower, btw?
Still looking into it. slapadd -q uses 2 threads, one to parse the LDIF
and one to write to the DB. syncrepl consumer only uses 1 thread.
Probably if we split reading from the network apart from writing to the
DB, that would make the difference.
That would be worth a try. Although I expect the disk access to be the
bottleneck here, and using two threads might swamp the memory, up to a
point. Interesting problem, interesting benchmark to conduct ;-)

Emmanuel.
Emmanuel Lécharny
2015-05-11 17:15:49 UTC
Permalink

Restarting this thread...

we have had some interesting discussion today that I wanted to share.

Hypothesis: one server has been down for a long time, and its contextCSN
is older than that of the other servers, forcing a refresh with more
than the content of the accesslog.

Quanah said that on some heavily loaded servers, the only way for the
consumer to catch up is to stop it, slapcat the master, slapadd, and
restart. I wonder whether that could be a way to deal with servers that
are too far behind the running servers, but as a mechanism included in
the refresh phase (i.e., the restarted server would detect that it has
to grab the full set of entries and load them, as if a human being were
doing a slapcat/slapadd/restart).

More specifically, is there a way to know how many entries we will have
to update, and is there a way to know when it will be faster to be
brutal (the Quanah way) compared to letting the refresh mechanism do its
job?

Another point: as soon as the server is restarted, it can receive
incoming requests and will send back outdated responses until the
refresh is completed (and I'm not talking about updates that could also
be applied to an outdated base, with the consequences if some parents
are missing). In many cases that would be a real problem, typically if
the LDAP servers are part of a shared pool with a load-balancing
mechanism to spread the load. Wouldn't it be more realistic to simply
consider the server as unavailable until the refresh phase is completed?

Thanks!
Quanah Gibson-Mount
2015-05-11 17:47:35 UTC
Permalink
Content preview: --On Monday, May 11, 2015 8:15 PM +0200 Emmanuel Lécharny
<***@gmail.com> wrote: > Quanah said that in some heavily servers,
the only way for the consumer > to catch up is to slapcat/slapadd/restart
the consumer. I wonder if it > would not be a way to deal with server that
are to far behind the > running server, but as a mechanism that is included
in the refresh phase > (ie, the restarted server will detect that it has
to grab the set of > entries and load them, os if a human being was doing
a > slapcat/slapadd/restart). [...]

Content analysis details: (-4.3 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium
trust
[162.209.122.184 listed in list.dnswl.org]
-0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
domain
-0.0 SPF_PASS SPF: sender matches SPF record
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
domain
0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature

--On Monday, May 11, 2015 8:15 PM +0200 Emmanuel Lécharny
Post by Emmanuel Lécharny
Quanah said that on some heavily loaded servers, the only way for the consumer
to catch up is to slapcat/slapadd/restart the consumer. I wonder if that
could be a way to deal with servers that are too far behind the
running server, but as a mechanism that is included in the refresh phase
(ie, the restarted server will detect that it has to grab the full set of
entries and load them, as if a human being were doing a
slapcat/slapadd/restart).
A specific example we had in the past was quarterly updates for students @
Stanford, which could push out tens of thousands of updates to the
single-node master. Generally, of the 6 slaves, 2-3 would remain current,
and the other 3 would fall hours or days behind. Since serving out
significantly out-of-date data was not an option, we'd generally have to
resort to reloading the ones that got stuck behind to get them sync'd up in
a timely fashion.
Post by Emmanuel Lécharny
Another point: as soon as the server is restarted, it can receive
incoming requests, and it will send back outdated responses until the
refresh is completed (and I'm not talking about updates that could also
be applied on an outdated base, with the consequences that follow if some
parents are missing). In many cases, that would be a real problem, typically
if the LDAP servers are part of a shared pool, with a load-balancing
mechanism to spread the load. Wouldn't it be more realistic to simply
consider the server as not available until the refresh phase is completed?
There's already an option for this, new for OpenLDAP 2.5 IIRC, that makes
it return LDAP_BUSY or some such until it is "caught up". However, if you
enable that option, it always returns this response, which is problematic,
because a server may routinely flip between "caught up" and not "caught
up". I.e., it is not unusual for a system to be a second or so behind
other masters. Here's real-world data from a check I just ran at a client
(a small comparison sketch follows the output):

[***@zm-mmr01 ~]$ ./libexec/zmreplchk
Master: ldap://zm-mmr01.client.net:389 ServerID: 1 Code: 6 Status: 0y 0M 0w
0d 0h 0m 1s behind CSNs:
20150504222317.897445Z#000000#001#000000
20150511174531.424005Z#000000#002#000000
20150501181032.360324Z#000000#00a#000000
20150511174535.964334Z#000000#00b#000000
Master: ldap://zm-mmr00.client.net:389 ServerID: 2 Code: 0 Status: In Sync
CSNs:
20150504222317.897445Z#000000#001#000000
20150511174531.424005Z#000000#002#000000
20150501181032.360324Z#000000#00a#000000
20150511174535.964334Z#000000#00b#000000
Master: ldap://nvl-mmr10.client.net:389 ServerID: 10 Code: 6 Status: 0y 0M
0w 0d 0h 0m 1s behind CSNs:
20150504222317.897445Z#000000#001#000000
20150511174531.424005Z#000000#002#000000
20150501181032.360324Z#000000#00a#000000
20150511174536.315403Z#000000#00b#000000
Master: ldap://nvl-mmr11.client.net:389 ServerID: 11 Code: 6 Status: 0y 0M
0w 0d 0h 0m 1s behind CSNs:
20150504222317.897445Z#000000#001#000000
20150511174531.424005Z#000000#002#000000
20150501181032.360324Z#000000#00a#000000
20150511174536.315403Z#000000#00b#000000
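
Worth noting how such a check can work: CSNs are fixed-width timestamp
strings, so lexicographic order matches chronological order and a plain
strcmp() per serverID is enough to tell whether one replica is behind
another. A minimal C sketch, using values from the output above (this is
not zmreplchk's actual code):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* contextCSN values for serverID 00b on two of the replicas above */
    const char *here  = "20150511174535.964334Z#000000#00b#000000";
    const char *there = "20150511174536.315403Z#000000#00b#000000";

    /* fixed-width fields: lexicographic order == chronological order */
    if (strcmp(here, there) < 0)
        printf("this replica is behind for serverID 00b\n");
    else
        printf("in sync (or ahead) for serverID 00b\n");
    return 0;
}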


--Quanah


--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Howard Chu
2015-05-11 20:17:59 UTC
Permalink
Content preview: Emmanuel Lécharny wrote: > Restarting this thread... > >
we have had some interesting discussion today that I wanted to share. > >
Hypothesis: one server has been down for a long time, and the contextCSN >
is older than that of the other servers, forcing a refresh mode with > more
than the content of the AccessLog. > > Quanah said that on some heavily
loaded servers, the only way for the consumer > to catch up is to
slapcat/slapadd/restart the consumer. I wonder if that > could be a way to
deal with servers that are too far behind the > running server, but as a
mechanism included in the refresh phase > (ie, the restarted server will
detect that it has to grab the full set of > entries and load them, as if a
human being were doing a > slapcat/slapadd/restart). > > More specifically,
is there a way to know how many entries we will have > to update, and is
there a way to know when it will be faster to be > brutal (the Quanah way)
compared to letting the refresh mechanism do its > job. [...]

Content analysis details: (-4.2 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium
trust
[69.43.206.106 listed in list.dnswl.org]
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
Post by Emmanuel Lécharny
Restarting this thread...
we have had some interesting discussion today that I wanted to share.
Hypothesis: one server has been down for a long time, and its contextCSN
is older than that of the other servers, forcing a refresh that covers
more than the content of the AccessLog.
Quanah said that on some heavily loaded servers, the only way for the consumer
to catch up is to slapcat/slapadd/restart the consumer. I wonder if that
could be a way to deal with servers that are too far behind the
running server, but as a mechanism that is included in the refresh phase
(ie, the restarted server will detect that it has to grab the full set of
entries and load them, as if a human being were doing a
slapcat/slapadd/restart).
More specifically, is there a way to know how many entries we will have
to update, and is there a way to know when it will be faster to be
brutal (the Quanah way) compared to letting the refresh mechanism do its
job?
Not a worthwhile direction to pursue. Doing the equivalent of a full
slapcat/slapadd across the network will use even more bandwidth than the
current syncrepl. None of this addresses the underlying causes of why the
consumer is slow, so the original problem will remain.

There are two main problems:
1) the AVL tree used for presentlist is still extremely inefficient in both
CPU and memory use.
2) the consumer does twice as much work for a single modification as the
provider. I.e., the consumer does a write op to the backend for the
modification, and then a second write op to update its contextCSN. The
provider only does the original modification, and caches the contextCSN update.

If we fix both of these issues, consumer speed should be much faster. Nothing
else is worth investigating until these two areas are reworked.

For (1) I've been considering a stripped down memory-only version of LMDB.
There are plenty of existing memory-only Btree implementations out there
already though, if anyone has a favorite it would probably save us some time
to use an existing library. The Linux kernel has one (lib/btree.c) but it's
under GPL so we can't use it directly.
Post by Emmanuel Lécharny
Another point: as soon as the server is restarted, it can receive
incoming requests, and it will send back outdated responses until the
refresh is completed (and I'm not talking about updates that could also
be applied on an outdated base, with the consequences that follow if some
parents are missing). In many cases, that would be a real problem, typically
if the LDAP servers are part of a shared pool, with a load-balancing
mechanism to spread the load. Wouldn't it be more realistic to simply
consider the server as not available until the refresh phase is completed?
This was ITS#7616. We tried it and it caused a lot of problems. It has been
reverted.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-05-11 22:34:34 UTC
Permalink
Content preview: On 11/05/15 22:17, Howard Chu wrote: > Emmanuel Lécharny
wrote: >> Restarting this thread... >> >> we have had some interesting
discussion today that I wanted to share. >> >> Hypothesis: one server has
been down for a long time, and the contextCSN >> is older than that of the
other servers, forcing a refresh mode with >> more than the content of the
AccessLog. >> Quanah said that on some heavily loaded servers, the only way
for the consumer >> to catch up is to slapcat/slapadd/restart the consumer.
I wonder if that >> could be a way to deal with servers that are too far
behind the >> running server, but as a mechanism included in the refresh
phase >> (ie, the restarted server will detect that it has to grab the full
set of >> entries and load them, as if a human being were doing a >>
slapcat/slapadd/restart). >> More specifically, is there a way to know how
many entries we will have >> to update, and is there a way to know when it
will be faster to be >> brutal (the Quanah way) compared to letting the
refresh mechanism do its >> job. > Not a worthwhile direction to pursue.
Doing the equivalent of a full > slapcat/slapadd across the network will
use even more bandwidth than > the current syncrepl. None of this addresses
the underlying causes of > why the consumer is slow, so the original
problem will remain. [...]

Content analysis details: (-2.7 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low
trust
[74.125.82.53 listed in list.dnswl.org]
0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
(elecharny[at]gmail.com)
-0.0 SPF_PASS SPF: sender matches SPF record
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
domain
0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
Post by Howard Chu
Post by Emmanuel Lécharny
Restarting this thread...
we have had some interesting discussion today that I wanted to share.
Hypothesis: one server has been down for a long time, and its contextCSN
is older than that of the other servers, forcing a refresh that covers
more than the content of the AccessLog.
Quanah said that on some heavily loaded servers, the only way for the consumer
to catch up is to slapcat/slapadd/restart the consumer. I wonder if that
could be a way to deal with servers that are too far behind the
running server, but as a mechanism that is included in the refresh phase
(ie, the restarted server will detect that it has to grab the full set of
entries and load them, as if a human being were doing a
slapcat/slapadd/restart).
More specifically, is there a way to know how many entries we will have
to update, and is there a way to know when it will be faster to be
brutal (the Quanah way) compared to letting the refresh mechanism do its
job?
Not a worthwhile direction to pursue. Doing the equivalent of a full
slapcat/slapadd across the network will use even more bandwidth than
the current syncrepl. None of this addresses the underlying causes of
why the consumer is slow, so the original problem will remain.
IMHO, network congestion is not a real problem. Assuming you are running a
1Gb ethernet network, the time it takes to transmit 1 million 1KB entries
is only about 10 seconds (1M x 1KB = 1GB, i.e. 8Gb of traffic at 1Gb/s).
It will be barely noticeable compared to the time it will take to load
those 1M entries into your consumer. Even with a 100Gb ethernet network,
this is not a big part of the problem.
Post by Howard Chu
1) the AVL tree used for presentlist is still extremely inefficient
in both CPU and memory use.
2) the consumer does twice as much work for a single modification as
the provider. I.e., the consumer does a write op to the backend for
the modification, and then a second write op to update its contextCSN.
Updating the contextCSN is an extra operation on the consumer, but as
you have to update potentially tens of indexes when updating an entry
(on both the consumer and the producer), it's not really twice the
work. It's an additional operation, but it would not double the time
it costs compared to the producer.

The question would be: how do we update the contextCSN only
periodically, to mitigate this extra cost? It seems you proposed to
batch the updates for this reason. By using batches of 500 updates, this
extra cost becomes almost unnoticeable, and one would expect the work on
the consumer to be about the same as on the producer side, right?
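
To illustrate the batching idea, a minimal C sketch (names are
illustrative, not the actual slapd code): the consumer defers the
contextCSN write and flushes it once every BATCH_SIZE entry writes, with
one final flush at the end of the refresh:

#include <stdio.h>
#include <string.h>

#define BATCH_SIZE 500

struct consumer_state {
    char pending_csn[64];     /* newest CSN applied but not yet persisted */
    int  writes_since_flush;
};

/* stand-ins for the backend write ops; each is normally one synchronous txn */
static void backend_write_entry(const char *dn)     { printf("write %s\n", dn); }
static void backend_write_contextcsn(const char *c) { printf("flush CSN %s\n", c); }

static void consumer_apply(struct consumer_state *st, const char *dn,
                           const char *csn)
{
    backend_write_entry(dn);                        /* one txn per entry */
    strncpy(st->pending_csn, csn, sizeof(st->pending_csn) - 1);
    st->pending_csn[sizeof(st->pending_csn) - 1] = '\0';
    if (++st->writes_since_flush >= BATCH_SIZE) {
        backend_write_contextcsn(st->pending_csn);  /* 1 extra txn per 500 */
        st->writes_since_flush = 0;
    }
}

With BATCH_SIZE = 500, the contextCSN writes add 0.2% to the write count
instead of 100%.
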
Post by Howard Chu
The provider only does the original modification, and caches the
contextCSN update.
If we fix both of these issues, consumer speed should be much faster.
Nothing else is worth investigating until these two areas are reworked.
Agreed in most cases. Although for use cases where a large number
of updates have occurred while a consumer is offline, another
strategy might work. That this other strategy is to stop the consumer,
slapcat the producer, slapadd the result and restart the server, all
from the command line instead of having it implemented in the server
code, was what I was suggesting; but this is another story, for a corner
case that is not frequent. Plus we don't know at which point this would
be the correct strategy (ie, for how many updates should we consider it
a better strategy than the current implementation?).
Post by Howard Chu
For (1) I've been considering a stripped down memory-only version of
LMDB. There are plenty of existing memory-only Btree implementations
out there already though, if anyone has a favorite it would probably
save us some time to use an existing library. The Linux kernel has one
(lib/btree.c) but it's under GPL so we can't use it directly.
Q: do you need to keep the presentList in a BTree at all?
Post by Howard Chu
Post by Emmanuel Lécharny
Another point: as soon as the server is restarted, it can receive
incoming requests, and it will send back outdated responses until the
refresh is completed (and I'm not talking about updates that could also
be applied on an outdated base, with the consequences that follow if some
parents are missing). In many cases, that would be a real problem, typically
if the LDAP servers are part of a shared pool, with a load-balancing
mechanism to spread the load. Wouldn't it be more realistic to simply
consider the server as not available until the refresh phase is completed?
This was ITS#7616. We tried it and it caused a lot of problems. It has
been reverted.
The two options were to either send a referral (not ideal, as we have no
control whatsoever over the client API) or return LDAP_BUSY. A third option
would be possible: chaining the request to the server the replication
updates are coming from. Doing so would guarantee that the client gets an
updated version of the data, as the producer is up to date. There is still
an issue though if both servers are replicating each other (pretty much the
same problem as with referrals). OTOH, if the other server is also in
refresh mode, it should be possible to return LDAP_BUSY if it is capable of
detecting that the request comes from another server, not from a client.
Maybe it's far-fetched...
Howard Chu
2015-05-12 12:34:02 UTC
Permalink
Content preview: Emmanuel Lécharny wrote: > On 11/05/15 22:17, Howard Chu
wrote: >> There are two main problems: >> 1) the AVL tree used for presentlist
is still extremely inefficient >> in both CPU and memory use. >> 2) the consumer
does twice as much work for a single modification as >> the provider. I.e.,
the consumer does a write op to the backend for >> the modification, and
then a second write op to update its contextCSN. [...]

Content analysis details: (-4.2 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium
trust
[69.43.206.106 listed in list.dnswl.org]
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
Post by Emmanuel Lécharny
Post by Howard Chu
1) the AVL tree used for presentlist is still extremely inefficient
in both CPU and memory use.
2) the consumer does twice as much work for a single modification as
the provider. I.e., the consumer does a write op to the backend for
the modification, and then a second write op to update its contextCSN.
Updating the contextCSN is an extra operation on the consumer, but as
you have to update potentially tens of indexes when updating an entry
(on both the consumer and the producer), it's not really twice the
work. It's an additional operation, but it would not double the time
it costs compared to the producer.
You're forgetting one very important thing - each operation is a single
transaction in the backend, and transactions are synchronous by default. The
main cost is not the indexing, it's the txn fsync, and yes, it is twice the
cost when you're doing two txns instead of just one.
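
To make that concrete, a minimal LMDB sketch of what the consumer
effectively does per modification today (env/dbi setup and error handling
omitted; only the transaction boundaries matter). Each mdb_txn_commit() in
a synchronous environment ends in an fsync:

#include <lmdb.h>

static void apply_one_mod(MDB_env *env, MDB_dbi dbi,
                          MDB_val *ekey, MDB_val *edata,  /* the entry */
                          MDB_val *ckey, MDB_val *cdata)  /* the contextCSN */
{
    MDB_txn *txn;

    mdb_txn_begin(env, NULL, 0, &txn);
    mdb_put(txn, dbi, ekey, edata, 0);   /* the modification itself */
    mdb_txn_commit(txn);                 /* fsync #1 */

    mdb_txn_begin(env, NULL, 0, &txn);
    mdb_put(txn, dbi, ckey, cdata, 0);   /* the contextCSN update */
    mdb_txn_commit(txn);                 /* fsync #2 */
}

Folding the contextCSN put into the same transaction, or batching it as
discussed above, removes the second fsync.
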
Post by Emmanuel Lécharny
The question would be: how do we update the contextCSN only
periodically, to mitigate this extra cost? It seems you proposed to
batch the updates for this reason. By using batches of 500 updates, this
extra cost becomes almost unnoticeable, and one would expect the work on
the consumer to be about the same as on the producer side, right?
Yes.
Post by Emmanuel Lécharny
Post by Howard Chu
The provider only does the original modification, and caches the
contextCSN update.
If we fix both of these issues, consumer speed should be much faster.
Nothing else is worth investigating until these two areas are reworked.
Agreed in most cases. Although for use cases where a large number
of updates have occurred while a consumer is offline, another
strategy might work. That this other strategy is to stop the consumer,
slapcat the producer, slapadd the result and restart the server, all
from the command line instead of having it implemented in the server
code, was what I was suggesting; but this is another story, for a corner
case that is not frequent. Plus we don't know at which point this would
be the correct strategy (ie, for how many updates should we consider it
a better strategy than the current implementation?).
If the consumer has been offline for a long time, then this discussion is
moot. No clients will be looking at it, so the risk of serving out-of-date
information to clients is zero. In that case, it doesn't matter what strategy
you use, they'll all work.
Post by Emmanuel Lécharny
Post by Howard Chu
For (1) I've been considering a stripped down memory-only version of
LMDB. There are plenty of existing memory-only Btree implementations
out there already though, if anyone has a favorite it would probably
save us some time to use an existing library. The Linux kernel has one
(lib/btree.c) but it's under GPL so we can't use it directly.
Q: do you need to keep the presentList in a BTree at all?
Good question. We process it by doing a single search over the target range,
and removing presentlist entries for each entry returned by the search. Since
the search order is random, we want fast search access to the presentlist.

We could alternatively do a dynamic array and walk the presentlist in order,
doing (entryUUID=x) searches on each element. The overhead of doing X
individual searches is worse than doing one global search though.
Post by Emmanuel Lécharny
Post by Howard Chu
Post by Emmanuel Lécharny
Another point: as soon as the server is restarted, it can receive
incoming requests, and it will send back outdated responses until the
refresh is completed (and I'm not talking about updates that could also
be applied on an outdated base, with the consequences that follow if some
parents are missing). In many cases, that would be a real problem, typically
if the LDAP servers are part of a shared pool, with a load-balancing
mechanism to spread the load. Wouldn't it be more realistic to simply
consider the server as not available until the refresh phase is completed?
This was ITS#7616. We tried it and it caused a lot of problems. It has
been reverted.
The two options were to either send a referral (not ideal, as we have no
control whatsoever over the client API) or return LDAP_BUSY. A third option
would be possible: chaining the request to the server the replication
updates are coming from. Doing so would guarantee that the client gets an
updated version of the data, as the producer is up to date. There is still
an issue though if both servers are replicating each other (pretty much the
same problem as with referrals). OTOH, if the other server is also in
refresh mode, it should be possible to return LDAP_BUSY if it is capable of
detecting that the request comes from another server, not from a client.
Maybe it's far-fetched...
In practice, two MMR servers pointed at each other would never make progress.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Emmanuel Lécharny
2015-05-12 13:48:16 UTC
Permalink
Content preview: On 12/05/15 14:34, Howard Chu wrote: > Emmanuel Lécharny
wrote: >> On 11/05/15 22:17, Howard Chu wrote: >>> There are two main
problems: >>> 1) the AVL tree used for presentlist is still extremely
inefficient >>> in both CPU and memory use. >>> 2) the consumer does twice
as much work for a single modification as >>> the provider. I.e., the
consumer does a write op to the backend for >>> the modification, and then
a second write op to update its contextCSN. >> Updating the contextCSN is
an extra operation on the consumer, but as >> you have to update
potentially tens of indexes when updating an entry >> (on both the consumer
and the producer), it's not really twice the work. >> It's an additional
operation, but it would not double the time >> it costs compared to the
producer. > You're forgetting one very important thing - each operation is
a > single transaction in the backend, and transactions are synchronous by
> default. The main cost is not the indexing, it's the txn fsync, and >
yes, it is twice the cost when you're doing two txns instead of just one. [...]

Content analysis details: (-2.7 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low
trust
[74.125.82.47 listed in list.dnswl.org]
0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
(elecharny[at]gmail.com)
-0.0 SPF_PASS SPF: sender matches SPF record
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
-0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
domain
0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
Post by Howard Chu
Post by Emmanuel Lécharny
1) the AVL tree used for presentlist is still extremely inefficient
in both CPU and memory use.
2) the consumer does twice as much work for a single modification as
the provider. I.e., the consumer does a write op to the backend for
the modification, and then a second write op to update its contextCSN.
Updating the contextCSN is an extra operation on the consumer, but as
you have to update potentially tens of indexes when updating an entry
(on both the consumer and the producer), it's not really twice the
work. It's an additional operation, but it would not double the time
it costs compared to the producer.
You're forgetting one very important thing - each operation is a
single transaction in the backend, and transactions are synchronous by
default. The main cost is not the indexing, it's the txn fsync, and
yes, it is twice the cost when you're doing two txns instead of just one.
Good point.
Post by Howard Chu
Post by Emmanuel Lécharny
The question would be: how do we update the contextCSN only
periodically, to mitigate this extra cost? It seems you proposed to
batch the updates for this reason. By using batches of 500 updates, this
extra cost becomes almost unnoticeable, and one would expect the work on
the consumer to be about the same as on the producer side, right?
Yes.
Post by Emmanuel Lécharny
The provider only does the original modification, and caches the
contextCSN update.
If we fix both of these issues, consumer speed should be much faster.
Nothing else is worth investigating until these two areas are reworked.
Agreed in most cases. Although for use cases where a large number
of updates have occurred while a consumer is offline, another
strategy might work. That this other strategy is to stop the consumer,
slapcat the producer, slapadd the result and restart the server, all
from the command line instead of having it implemented in the server
code, was what I was suggesting; but this is another story, for a corner
case that is not frequent. Plus we don't know at which point this would
be the correct strategy (ie, for how many updates should we consider it
a better strategy than the current implementation?).
If the consumer has been offline for a long time, then this discussion
is moot. No clients will be looking at it, so the risk of serving
out-of-date information to clients is zero. In that case, it doesn't
matter what strategy you use, they'll all work.
Another good point. It would require a user that sends a hell of a lot of
updates while no client has any activity, plus a disconnected consumer
- all three conditions at the same time - to face my scenario. Quite
rare. I had in mind the user who updates his database with
millions of updates once in a while (say, once a year), during the night,
and who finds that the consumer is not up and running in the morning.

Not sure then that it's worth the effort to find a way to mitigate such a
corner case.
Post by Howard Chu
Post by Emmanuel Lécharny
For (1) I've been considering a stripped down memory-only version of
LMDB. There are plenty of existing memory-only Btree implementations
out there already though, if anyone has a favorite it would probably
save us some time to use an existing library. The Linux kernel has one
(lib/btree.c) but it's under GPL so we can't use it directly.
Q: do you need to keep the presentList in a BTree at all?
Good question. We process it by doing a single search over the target
range, and removing presentlist entries for each entry returned by the
search. Since the search order is random, we want fast search access
to the presentlist.
We could alternatively do a dynamic array and walk the presentlist in
order, doing (entryUUID=x) searches on each element. The overhead of
doing X individual searches is worse than doing one global search though.
If the goal is to find all the entries that are not present in the DB,
wouldn't it be faster to simply quicksort the entryUUIDs we have
received? Both algorithms (AVL insertion and quicksort) are O(n log n)
- setting aside the possibility that quicksort degenerates to O(n^2),
of course - but quicksort is faster than an AVL tree when it comes to
ordering a set of values.
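
For illustration, a minimal C sketch of that sorted-array alternative
(illustrative only, not slapd's presentlist code): qsort() the received
entryUUIDs once, bsearch() for each entry returned by the refresh search,
and whatever is left unmarked at the end was not present and must be
deleted:

#include <stdlib.h>
#include <string.h>

#define UUID_LEN 16

struct present {
    unsigned char uuid[UUID_LEN];  /* key: must stay the first member */
    int seen;
};

static int cmp_uuid(const void *a, const void *b)
{
    return memcmp(a, b, UUID_LEN); /* compares only the leading uuid field */
}

/* O(n log n), done once when the presentlist is received */
static void sort_presentlist(struct present *list, size_t n)
{
    qsort(list, n, sizeof(*list), cmp_uuid);
}

/* O(log n) per entry returned by the refresh search */
static void mark_present(struct present *list, size_t n,
                         const unsigned char *uuid)
{
    struct present key;
    memset(&key, 0, sizeof(key));
    memcpy(key.uuid, uuid, UUID_LEN);
    struct present *hit = bsearch(&key, list, n, sizeof(*list), cmp_uuid);
    if (hit)
        hit->seen = 1;
}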
