Dublin Core

Announcement for MARC4J Publication

Crosswalking is a concise book for library programmers who want to learn to use MARC4J to process bibliographic data. MARC4J is an open source software library for working with MARC, MARCXML and related bibliographic standards in Java. The library is designed to bridge the gap between MARC and XML.

It is divided into the following chapters:

Chapter 1, Reading Data
Chapter 1 provides a short introduction to the MARC formats and then focuses on reading MARC and MARCXML data. This chapter also explains how to create and update records, and it demonstrates how to pre-process the input to convert MODS to MARC.
Chapter 2, Writing Data
Chapter 2 concentrates on the details of writing MARC and MARCXML data and how to post-process the output to convert MARC to MODS.
Chapter 3, MARC4J and JAXP
Chapter 3 explores integration with some important Java XML APIs, including JAXP, SAX and DOM. It demonstrates how to write the result to a DOM document, how to format XML output using a dedicated XML serializer, how to build pipelines using XSLT, and how to use the SAX interface as an alternative to XSLT.
Chapter 4, Indexing with Lucene
Chapter 4 concentrates on indexing and searching MARC data with Apache Lucene using the MARC4J Lucene API.
Chapter 5, Putting It All Together
Chapter 5 focuses on building an SRU Search/Retrieve Web application using the various MARC4J interfaces and classes to process MARC data and using Lucene for indexing and searching.
Appendix A, MARC4J API Summary
Appendix A provides a summary of the core MARC4J interfaces and classes.
Appendix B, Command-line Reference
Appendix B documents the command-line programs included in the MARC4J API.

This book provides useful information for both developers learning about MARC4J for the first time and developers returning for reference and more advanced material. The chapters provide many reusable examples, while appendixes provide a reference to the API and the command-line utilities.

Crosswalking is published through lulu.com.

Visit lulu.com for more information.

MARC4J Lucene API 0.1

A new software library is available from the MARC4J project website (http://marc4j.tigris.org). The MARC4J Lucene API provides an easy-to-use, easy-to-configure utility for creating Lucene indexes from MARC or MARCXML data. Lucene is an open source text search engine library written in Java.

By default the library uses an index context based on the MARC to Dublin Core crosswalk, but users can create their own index configuration using a simple XML format. It is also possible to store the full MARC record as binary content. A command-line utility is included so that a Lucene index can be created without writing code. The following command, for example, adds the MARC records in input.mrc to an existing Lucene index using the given index schema:

java org.marc4j.lucene.util.MarcIndexDriver -index /home/index -schema file:///home/schema.xml input.mrc
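
The default index context is described above as following the MARC to Dublin Core crosswalk. As a rough illustration of that idea only, the sketch below maps a handful of MARC tag and subfield pairs to Dublin Core element names using plain Java collections. The class name CrosswalkSketch, the method toIndexFields, and the selection of mapped fields are assumptions made for this example. It does not call the MARC4J Lucene API, and the fields the library actually indexes by default may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of MARC4J or the MARC4J Lucene API.
class CrosswalkSketch {

    // Partial mapping from "tag$subfield" to a Dublin Core element name.
    // The pairs chosen here are an assumption for the example, not the
    // library's default index context.
    static final Map<String, String> CROSSWALK = new LinkedHashMap<>();
    static {
        CROSSWALK.put("020$a", "identifier");
        CROSSWALK.put("100$a", "creator");
        CROSSWALK.put("245$a", "title");
        CROSSWALK.put("260$b", "publisher");
        CROSSWALK.put("260$c", "date");
        CROSSWALK.put("520$a", "description");
        CROSSWALK.put("650$a", "subject");
    }

    // Each input entry is {tag, subfield code, value}. Subfields without a
    // mapping are skipped; repeated elements (such as several subjects)
    // are collected in input order.
    static Map<String, List<String>> toIndexFields(List<String[]> subfields) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (String[] sf : subfields) {
            String element = CROSSWALK.get(sf[0] + "$" + sf[1]);
            if (element == null) {
                continue; // no Dublin Core equivalent in this partial mapping
            }
            fields.computeIfAbsent(element, k -> new ArrayList<>()).add(sf[2]);
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String[]> record = new ArrayList<>();
        record.add(new String[] {"100", "a", "Chabon, Michael."});
        record.add(new String[] {"245", "a", "Summerland /"});
        record.add(new String[] {"650", "a", "Fantasy."});
        record.add(new String[] {"650", "a", "Baseball"});
        record.add(new String[] {"999", "a", "local use only"}); // unmapped, skipped

        // Print each Dublin Core element with the values gathered for it.
        toIndexFields(record).forEach((element, values) ->
                System.out.println(element + ": " + values));
    }
}
```

Each resulting element name would correspond to one field in a search index, which is why a repeatable MARC field such as 650 produces a list of values rather than a single string.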

The library can be downloaded from the Documents and files section of the MARC4J project page at http://marc4j.tigris.org. Look for a folder called marc4j-lucene. The library is published under the LGPL.

Greenstone-2.53

Katherine writes: "The Windows, GNU/Linux, and Source distributions of Greenstone v2.53 are now available for download. Important changes in this release include, in no particular order: a brand new installer; much improved GLI compatibility with Java 1.5.0; the ability to import documents exported from DSpace, and vice versa; a smarter HTMLPlug that blocks the images in the HTML files it processes, and no others; new GLI metadata sets: Qualified Dublin Core, NZGLS, AGLS, and RFC 1807; Lucene building support (for real this time!); an improved and much more bandwidth-efficient GLI applet; support for subfields in the Greenstone Editor for Metadata Sets (GEMS); and many many other improvements and bug fixes."

VDC-1.0 (Stable)

VDC writes: "The Virtual Data Center (VDC) is an OSS digital library system 'in a box' for numeric data. VDC provides a complete open-source digital library system for the management, dissemination, exchange, and citation of virtual collections of quantitative data. The VDC functionality provides everything necessary to maintain and disseminate an individual collection of research studies, including facilities for the storage, archiving, cataloging, translation, and dissemination of each collection. On-line analysis is provided, powered by the R statistical environment. The system provides extensive support for distributed and federated collections, including: location-independent naming of objects, distributed authentication and access control, federated metadata harvesting, remote repository caching, and distributed virtual collections of remote objects. Release 1.0 of the VDC provides RPMs for Red Hat 9.0, Fedora Core 1, and Red Hat 3 Advanced Server. Supported standards, protocols, and formats include: DDI, Dublin Core, and MARC for metadata; R, SPSS, SAS, ASCII, and STATA for data; OAI and Z39.50 for queries; UNFs and Handles for naming/citation."

DSpace-1.1

A little birdie IM'd me that DSpace version 1.1 (including docs) is available. From the SourceForge news item: "New features include Advanced Search capability with fielded searching, better Unicode support, improved performance and lots of other good stuff. And bug fixes. Lots of bug fixes."

Scout Portal Toolkit-0.9.7

David wrote to oss4lib-discuss: "Internet Scout Project would like to invite potential portal developers to investigate the Scout Portal Toolkit (SPT). This open source software package, funded by the Mellon Foundation, allows groups or organizations to develop a portal online without making a big investment in technical resources or expertise. The current SPT package features include: a metadata field editor, which allows portal administrators to add, delete, or disable a variety of metadata fields; a Dublin Core compliant metadata field set; cross-field searching; resource annotations and quality ratings by users; intelligent user agents; suggested resource referrals (recommender system); accessibility for users with disabilities; and RSS channel export."

Greenstone 2.35

Gordon writes: "Version 2.35 of Greenstone adds many plugin enhancements, including bzip2 support, improved PDF and Word document handling, an updated ImagePlug (Unix only) and optional XML files for adding extra metadata. A Russian translation of the user interface has been added. Documents are now stored as XML. Greenstone now works on MacOS X (or any POSIX system), and the new CORBA interface gives other programs access to Greenstone collections."
