1 Reply Latest reply: Jul 16, 2013 10:36 AM by JimKlimov

    WABP contact import produces duplicates, how to avoid?


      Hello, now that the forums are back online, I'd like to follow up on my older question about importing Contacts data into user accounts of OUCS 7u2. It started in the Calendar thread "CLI tools to import Calendar and AddressBook entries into multiple users?", where I got many helpful replies for the Calendar part of the quest.


      Lacking (or not having heard of) any "official" command-line tools to manipulate users' address books, we decided to write a script which does the same via HTTP(S). Namely, it logs in to Convergence with proxyauth as the target user to get a token id (alas, the /wcap?httpauth=1 trick does not suffice for IWC auth), accesses the user's address book so that one is automatically created if absent, extracts the "Personal Address Book" id (it can also be fetched from LDAP once the book is created), and posts new entries into the identified address book (some path like /iwc/svc/wabp or similar) using the VCARD-formatted file that we extracted from the older messaging system. So far so good, except that this OUCS deployment is still in the POC phase preparing for migration, and if we rerun the process to update user accounts on the new box with information from the old system, duplicate contact entries appear (unlike, for example, calendar import with wcap, which apparently replaces old identical entries with newly imported ones).


      Currently we are building a workaround that checks that similar-seeming entries are not already present in LDAP (matching by name, phone1 or mail1) and then posts a VCARD stream containing only the unique entries. Is there any server-side flag to either avoid creating duplicate entries or replace old entries, or a way to post-process the address books and remove dupes?
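
      A minimal Python sketch of such a client-side pre-filter, under stated assumptions: the function names are hypothetical, the existing entries are assumed to have already been read (for example from LDAP) into a set of (kind, value) keys, and a card counts as a duplicate if it shares any name, mail or phone key with an existing entry. It does not talk to Convergence at all, and matching on name alone can drop distinct people who share a name.

```python
import re


def split_vcards(text):
    """Split a vCard stream into individual card strings."""
    return re.findall(r"BEGIN:VCARD.*?END:VCARD", text, flags=re.S | re.I)


def card_keys(card):
    """Return the set of (kind, value) keys used to compare cards."""
    keys = set()
    for line in card.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        # Drop parameters (";TYPE=...") and any group prefix ("item1.").
        prop = name.split(";")[0].split(".")[-1].upper()
        value = value.strip().lower()
        if not value:
            continue
        if prop == "FN":
            keys.add(("name", value))
        elif prop == "EMAIL":
            keys.add(("mail", value))
        elif prop == "TEL":
            digits = re.sub(r"\D", "", value)  # compare phones by digits only
            if digits:
                keys.add(("phone", digits))
    return keys


def filter_new_cards(cards, existing_keys):
    """Keep only cards that share no key with existing or earlier cards."""
    seen = set(existing_keys)
    fresh = []
    for card in cards:
        keys = card_keys(card)
        if keys & seen:
            continue  # looks like a duplicate, do not post it
        fresh.append(card)
        seen |= keys
    return fresh
```

      Only the cards returned by filter_new_cards would then be joined back into one stream and posted, so rerunning the import with the same input posts nothing new.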


      Possibly we are using sub-optimal tools, i.e. are there non-Convergence HTTP endpoints for the address book, supported command-line tools, etc.?



      //Jim Klimov

        • 1. Re: WABP contact import produces duplicates, how to avoid?

          My script suite to import various data types into an OUCS server is nearly complete - at least it works for many data samples.

          The Contacts duplicates are, after all, checked by "brute force": the script verifies the presence and values of entries in the user's existing PAB, and accordingly does not post matching entries from the provided VCARD input file to Convergence. There are cases when this fails, and a few bugs to iron out, but hopefully this can even become productized some day.


          A new question arose: it seems that when VCARD markup comes in wrapped lines (as in LDIF, ICS and others: if an attribute value is long, it is wrapped at some length limit, with subsequent lines for the attribute starting with a space) and this is fed into Convergence import.wabp, the resulting imported values in the PiServerDB LDAP entries have an embedded line break, e.g. in the middle of a long name for a "properly" single-valued displayName, etc.


          Is there some format mismatch between the standards, the exporting tools we used (DavMail mostly, and our parsing scripts) and the Convergence WABP importer?


          Does the WABP importer have any practical line-length limit? We can parse the markup to remove line breaks for such attributes, and feed Convergence some really long unbroken lines as a result. What would happen then? In fact, I'll try that soon...
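
          For reference, the vCard specification (RFC 6350) calls this wrapping "folding": a line break immediately followed by a single space or tab marks a continuation, and both are to be dropped when reading. A minimal Python sketch of unfolding a stream before posting it follows; the function name is hypothetical, and whether the Convergence importer copes with the resulting long lines is exactly the open question, not something this code answers.

```python
import re


def unfold(text):
    """Join folded lines: remove each line break that is followed by
    exactly one space or tab, together with that whitespace character."""
    return re.sub(r"\r?\n[ \t]", "", text)


folded = (
    "BEGIN:VCARD\r\n"
    "FN:A very long display na\r\n"
    " me that was wrapped\r\n"
    "END:VCARD\r\n"
)
unfolded = unfold(folded)
```

          Only the single whitespace character after the line break is removed, so a wrapped value that had a real space at the fold point keeps it.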


          Thanks for any new ideas, and for old help,

          //Jim Klimov