Any metrics on items per minute?

Are there any general metrics on the number of items that can, on average, be pulled across the interface with Millennium? I realize that it depends on a number of factors; I'm just looking for a general ballpark figure. Is it thousands per minute?

Todd Tomlinson
Managing Partner
Aha Consulting

Todd,
It does, indeed, depend on a number of factors. The primary factor is whether you configure Locum to use pcntl (PHP's process control extension) and, if you do, how many child processes you let it run. Out of the box, the sample configuration file is set to 10, which is what we use here in Darien.
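
To give a rough feel for why the child-process count matters, here is a minimal Python sketch of dividing a range of record numbers among a fixed number of workers. This is not Locum's actual PHP code, and I am not claiming this is how Locum partitions its work; the function name `split_range` and the record numbers are made up for illustration.

```python
def split_range(first, last, children=10):
    """Split the inclusive record range [first, last] into contiguous
    chunks, one per child process (hypothetical helper, not part of Locum)."""
    total = last - first + 1
    if total <= 0 or children < 1:
        return []
    size, extra = divmod(total, children)
    chunks = []
    start = first
    for i in range(children):
        # The first `extra` children take one additional record each.
        length = size + (1 if i < extra else 0)
        if length == 0:
            break  # fewer records than children: stop handing out work
        chunks.append((start, start + length - 1))
        start += length
    return chunks


if __name__ == "__main__":
    for chunk in split_range(1, 100, children=10):
        print(chunk)
```

With 10 children, each one only has to walk a tenth of the range, so the wall-clock time drops accordingly as long as the Innovative server can keep up with the parallel requests.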

The other major factor in harvest speed is whether you choose to use Syndetics and/or Amazon for cover images. Amazon is very fast, but Syndetics image URL harvesting can make your harvest take up to 15X longer. That's because 1) Syndetics is horribly slow and 2) the Syndetics API may indicate that a cover image exists for an ISBN, but I have found that in many, many cases their cover image is just a 1x1 GIF. So if you enable Syndetics, the harvester downloads each cover image and checks its dimensions so that it can weed out the bad ones. That adds a LOT of time.
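
The dimension check described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the harvester's actual code: the function names are hypothetical, and it assumes the image bytes have already been downloaded. It relies on the fact that a GIF file stores its width and height as two little-endian 16-bit integers right after the 6-byte signature.

```python
import struct


def gif_dimensions(data):
    """Return (width, height) read from a GIF header, or None if the
    bytes do not start with a GIF signature."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return None
    # Bytes 6-9 hold the logical screen width and height (little-endian).
    width, height = struct.unpack("<HH", data[6:10])
    return width, height


def is_placeholder_gif(data):
    """True if the image is a GIF no larger than 1x1 pixel, i.e. the kind
    of empty cover described above (hypothetical helper)."""
    dims = gif_dimensions(data)
    if dims is None:
        return False  # not a GIF, so not the 1x1 placeholder
    width, height = dims
    return width <= 1 and height <= 1


if __name__ == "__main__":
    fake_placeholder = b"GIF89a" + struct.pack("<HH", 1, 1) + b"\x00\x00\x00"
    print(is_placeholder_gif(fake_placeholder))
```

The check itself is cheap; the time goes into downloading every image before it can be inspected.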

Beyond that, the harvest speed depends on how well your Innovative server performs. To give you an idea of performance on our servers, we have an older DEC Alpha on the III side and we can pull 100K-150K records per hour (without covers). We'll be upgrading our hardware soon and I'll be interested to see how performance improves.
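
To put the hourly figure in the per-minute terms of the original question, here is a small back-of-the-envelope calculation in Python. The 300,000-record catalog size is a made-up example, not a number from our system, and the function names are just for illustration.

```python
def per_minute(records_per_hour):
    """Convert an hourly harvest rate to records per minute."""
    return records_per_hour / 60


def harvest_hours(total_records, records_per_hour):
    """Estimate how many hours a full harvest would take at a given rate."""
    return total_records / records_per_hour


if __name__ == "__main__":
    low, high = 100_000, 150_000  # records per hour, without covers
    print(round(per_minute(low)), round(per_minute(high)))

    catalog = 300_000  # hypothetical catalog size
    print(harvest_hours(catalog, high), harvest_hours(catalog, low))
```

So the answer is roughly 1,700 to 2,500 records per minute on this hardware without covers, and a catalog of the assumed size would take about two to three hours to harvest.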