Updated June 1, 2008

LIBR 202 Information Retrieval - dr. joanne twining
Welcome  / Greensheet / Class Schedule & Assignments / Grading / Blackboard


Assignment 1 - Refrigerator Database

This assignment has six elements (Part A, Sections 1-4, and Part B, Sections 1-2) and includes both individual and group work.
The assignment elements must be accomplished sequentially, in collaboration with your group.
 
You will be assigned to a group and notified when your group workspace is ready. Group workspace(s) are in Blackboard. 

Total Available Points: 150.
Points will be assigned individually and to your group, as detailed.
Work Product Checklist


Refrigerator Object Database

The purpose of this two-part, six-step progressive assignment is to provide:

Getting  started:  

1. Download and install the DBTextWorks (Inmagic) database software from the SJSU LIS website. The password to access the restricted area is posted in our Blackboard "Technology" forum and changes every semester.

2. Complete the SJSU LIS online DBTextWorks tutorial, the tutorial packaged with the software, and the tutorial and extra materials available in the Blackboard "Materials" section.

3. Scan the archives of the Blackboard discussion forum about inMagic and become familiar with the process of searching for an answer already provided.

4.  If you do not already have one, download, install, and learn to use an appropriate file compression utility for your personal computer (e.g. WinZip).  See the "Materials" tab in Blackboard for help.  Most of the files in this assignment must be submitted as a .zip file.  Unless otherwise indicated, all other files for this class must be submitted in .rtf (rich text format).

The "objects" for the database we will build in this assignment are items in a personal refrigerator.

Grading for this assignment is largely based on the retrievability of your assignment files.  You must deposit each required file, properly named, in the proper place to receive points for the assignment; if a file is not where it belongs, you will not receive the points for it.


Assignment Overview:

We will be dealing with two distinct concepts during this assignment: a data structure and its rules (the container, skeleton, or empty database without any data or records, along with the rules for entering data into that structure); and the data or records themselves (the contents) of the database.  Library information professionals design datastructures and rules, and enter the records into those structures in such a way that the records can be easily and efficiently searched and retrieved by system patrons, who likely have no knowledge of the structure and rules.  Accordingly, all work in this assignment is done with the USER in mind.

inMagic (also referred to as DBTextWorks) is a proprietary data retrieval software system commonly used in libraries and information centers around the world.  This is not an inMagic training class. 

In this assignment we are more interested in general database and retrieval concepts than in inMagic specifics, although you will have to learn and master those specifics in order to complete the assignments.  You are expected to learn the software on your own.  The general concepts will transfer to virtually any data retrieval system.

NOTE: inMagic uses a very specific collection of file extensions for its data structure files and record files. For instance, a datastructure contains 11 files, all of which must be present for the datastructure to function, and each file has a unique three-letter file extension (such as .dmp, .acf, etc.). This is one reason why we "zip" our datastructure work: to package all the required files together before transferring them.  The file in inMagic that contains only the records is the .dmp file.  Please do not change inMagic's file extensions.  To post your assignments, simply select the appropriate inMagic files and zip them into a single file using your file compression utility. Knowing how to zip and unzip files is a minimum technological requirement for admission to our program, and it is expected that you know how to do this.
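If you prefer a script to a desktop utility, the packaging step can be sketched in Python. This is only an illustration under stated assumptions: the function names zip_files and list_archive are made up for this example, and you would pass in whichever files inMagic actually created for your datastructure. The sketch stores each file under its original name and extension, and the second function lets you confirm that nothing was left out before you post.

```python
import zipfile
from pathlib import Path


def zip_files(paths, archive_name):
    """Package the given files into one .zip archive, keeping names and extensions."""
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            path = Path(path)
            # arcname stores only the file name, not the full folder path.
            archive.write(path, arcname=path.name)
    return archive_name


def list_archive(archive_name):
    """Return the sorted file names stored in the archive, to check nothing is missing."""
    with zipfile.ZipFile(archive_name) as archive:
        return sorted(archive.namelist())
```

Comparing the result of list_archive against the list of files you meant to include is a quick way to catch a missing file before someone else tries to open your datastructure.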

Part A

Part A of this assignment has four sections.  Two are done as a group, and two are done individually. 

Overview: your group will work collaboratively, using your group's Blackboard discussion forum, group email, and group chat to design and test a prototype data structure and rules  for a collection of objects commonly found in a personal refrigerator.  Note the word PERSONAL.

A1.  Your group will collaborate during design and construction of an inMagic prototype data structure and rules (assignment details) for items in a personal refrigerator.  After design and construction of your prototype, your group will deposit one zipped copy of your group's (empty) data structure and rules (rules must be saved in .rtf [rich text format]), with no records, into your group's file exchange area, naming the file "prototype.zip".  (25 points)

A2.  After your group has designed and deposited your prototype, each member of your group will download your group's final prototype.zip file and test it by individually entering data into it (creating records) for ten objects from your personal refrigerator, according to the rules of the data structure.  You are expected to work together during this phase, discussing and perfecting your group's data structure and rules (your final work will be evaluated by another group in Part B of this assignment).  Each student will then extract (export) their individual records from the group data structure, zip them, and individually deposit their database records (without the data structure or rules) into the group's file exchange area.  Name your file yourlastname_yourfirstname_records.zip (e.g. twining_joanne_records.zip) and name your post yourlastname_yourfirstname_records.  (25 points)

A3.  Each member of your group will then retrieve all the records for your group and aggregate them by importing them into the group's datastructure.  Note: the evaluation version of inMagic we are using has a 50 record limit.  If your group has more than 50 records, aggregate to the 50 record limit, but include some of each member's records in the aggregation.  Name your file yourlastname_yourfirstname_aggregate.zip (e.g. twining_joanne_aggregate.zip) and deposit that file in your group's file exchange area, using the subject line yourlastname_yourfirstname_aggregate.  (25 points)
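One simple way to stay under the record limit while still including some of every member's records is to take records in turns, one member at a time. The Python sketch below illustrates that idea only; it is not an inMagic feature, and the function name aggregate and the dictionary layout (member name mapped to that member's list of records) are assumptions made for the example.

```python
from itertools import zip_longest

RECORD_LIMIT = 50  # limit of the evaluation version, as stated in the assignment


def aggregate(records_by_member, limit=RECORD_LIMIT):
    """Take one record from each member in turn until the limit is reached."""
    selected = []
    # Each pass of zip_longest is one "round": the next record from every member.
    for round_of_records in zip_longest(*records_by_member.values()):
        for record in round_of_records:
            if record is None:  # this member has no more records
                continue
            if len(selected) == limit:
                return selected
            selected.append(record)
    return selected
```

With six members who each entered ten records, this keeps eight or nine records from every member rather than all ten from the first five members and none from the sixth.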

A4.  Finally, you will work together to create a final, perfected version of your data structure and rules, based on your experience from Sections A1-A3 of this assignment.  One member of your group will deposit a zipped file containing your group's final datastructure (with no records), including the rules and purpose statement (in .rtf format), into the "Assignment 1B Evaluations" Blackboard forum. This file will be evaluated by another group in Part B of this assignment.  Name your file yourgroupnumber_datastructure.zip (e.g. 4_datastructure.zip) and name your post yourgroupnumber_datastructure.  (25 points)

Part B

In Part B of this assignment, members of your group will evaluate the Part A work of another group (group exchanges will be posted).

B1.  Your group will download and unzip your exchange group's data structure, purpose statement, and rules. Each student will enter ten items from their personal refrigerator into that datastructure, and test that structure based on the purpose statement and rules.  Each group member will then extract their records, zip them, and deposit that zipped file in your group's file exchange area.  Name your file yourlastname_yourfirstname_testother.zip and name your post yourlastname_yourfirstname_testother.  (25 points)

B2.  Your group will use your group's Blackboard discussion area, group email, and group chat to collaboratively evaluate your exchange group's datastructure and rules. We will use normative evaluation, which is explained in a separate post in our "Assignment 1B Evaluations" forum. The purpose is to provide the design group with critical feedback and suggestions.  (assignment details) Your group will work together to create a single evaluation document (saved in .rtf [rich text format]), which one member of your group will deposit into the "Assignment 1B Evaluations" forum.  Name your file "x_evaluation_y.rtf", where x is your group's number and y is your exchange group's number (e.g. 4_evaluation_5.rtf).  Name your post x_evaluation_y.  Read and, where appropriate, comment on each group's evaluation work.  (25 points)


Part A, assignment details

In Part A of this assignment, your group will use Group Blackboard resources, group email, and group chat to

  1. define and describe a user group for your datastructure

  2. discuss and identify  the attributes of the objects found in any generic PERSONAL refrigerator,

  3. decide which attributes should be included in your IR system,

  4. decide what values should be allowed, and

  5. decide what rules you need to ensure consistency and accuracy in indexing. 

These decisions should be informed by the reading material related to description of information-bearing objects; you need to consider both the nature of the collection and the nature of potential use of the database.  

1.  Consider the group of people (or the individual) who will be the users of your collection. Who are they and why will they use your datastructure?  What attributes (fields) will they need to search on, and for what reasons?  What is the appropriate unit of analysis (a case of beer, for instance, or each individual can/bottle of beer)?  Will your attributes and values allow your users both to aggregate the objects in meaningful ways and to differentiate them from similar but irrelevant objects?  Remember that the users of the database will not necessarily want to find a single known item (if they know what item they want, they don’t need to use the database); rather, they will more often want to find all the items that have certain attributes.

Write up a statement of purpose for the database; explain who your user group is and what information needs the database is intended to meet.
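To make the difference between aggregating and differentiating concrete, here is a minimal Python sketch (plain Python, not inMagic). Everything in it is hypothetical: the field names, the values, and the find_items function exist only for this illustration, and each record describes one individual container as the unit of analysis.

```python
# Hypothetical records; one record per individual container (the unit of analysis).
RECORDS = [
    {"id": 1, "item": "beer", "container": "bottle", "opened": "no"},
    {"id": 2, "item": "beer", "container": "can", "opened": "no"},
    {"id": 3, "item": "salsa", "container": "jar", "opened": "yes"},
    {"id": 4, "item": "pickles", "container": "jar", "opened": "no"},
]


def find_items(records, **criteria):
    """Return every record whose fields equal all of the requested values."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Aggregate: everything stored in a jar.
jars = find_items(RECORDS, container="jar")
# Differentiate: only the jars that are already open.
open_jars = find_items(RECORDS, container="jar", opened="yes")
print([r["id"] for r in jars], [r["id"] for r in open_jars])
```

Notice that neither search names a specific item. The user asks for attributes, and whether the answer is useful depends entirely on which fields and values the designers chose to record.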

 2.  Establish the rules or standards for the content and structure of the record.  These should include:

·          the unit of description (unit of analysis)

·          a unique identifier for the record (usually a number that distinguishes each record in the database from all others unambiguously)

·          the fields each record will contain

·          which fields are mandatory (must contain data) and which are not

·          which fields are repeatable

·          Any definitions or explanations that will make it easier for your indexer to select the correct values. 

·          Any guidance as to when each value should be selected to help ensure that the surrogate will be an accurate representation of the original.

·          etc.

The rules or standards have the purpose of making the records, fields, and data consistent for purposes of retrieval.  This allows for some stability in the system, so it can be expanded and modified in a sensible, predictable way, and also lessens the possibility of ambiguous communication of the information in the database.  You may need a rule for each field (or you may not).  Ask yourself what mistakes an indexer could make in deciding what data to put in each field, and design the rules so that those mistakes won’t be made.  Ask yourself whether an indexer could look at your rules and your collection and create accurate records.
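As a way of thinking about what rules do, the Python sketch below writes a few rules down as data and checks records against them. It is an illustration only, not how inMagic stores or enforces rules. The field names, the allowed values, and the names FieldRule, check_record, and duplicate_ids are all hypothetical; a record is assumed to be a dictionary that maps each field name to a list of values, so that repeatable fields can hold more than one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    name: str
    mandatory: bool = False
    repeatable: bool = False
    allowed_values: tuple = ()  # empty tuple means any text is accepted


# Hypothetical rules, for illustration only.
RULES = (
    FieldRule("record_id", mandatory=True),
    FieldRule("item_name", mandatory=True),
    FieldRule("container", allowed_values=("bottle", "can", "jar", "none")),
    FieldRule("ingredient", repeatable=True),
)


def check_record(record, rules=RULES):
    """Return a list of rule violations for one record (field name -> list of values)."""
    problems = []
    for rule in rules:
        values = record.get(rule.name, [])
        if rule.mandatory and not values:
            problems.append(f"{rule.name}: mandatory field is empty")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{rule.name}: field is not repeatable")
        for value in values:
            if rule.allowed_values and value not in rule.allowed_values:
                problems.append(f"{rule.name}: '{value}' is not an allowed value")
    return problems


def duplicate_ids(records):
    """Return record_id values that appear in more than one record."""
    seen, duplicates = set(), set()
    for record in records:
        for record_id in record.get("record_id", []):
            if record_id in seen:
                duplicates.add(record_id)
            seen.add(record_id)
    return sorted(duplicates)
```

Each message check_record can produce corresponds to a mistake an indexer could make. If you cannot say which rule would catch a given mistake, that is a sign a rule is missing or unclear.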

Write up the rules.

3. Once you have collaboratively planned, documented, and created the preliminary data structure for your group database (you may each build the structure individually, or build one structure and share it in the group...but each group member is responsible for learning how to do this as you will do it on your own in Assignment 3), you will each move from the role of database designer to the role of indexer, and begin refining and beta testing your group's preliminary data structure. 

Working individually, use your group's preliminary data structure to create one record for each of ten items from your personal refrigerator, assigning the appropriate values for each field.  As you are entering your individual records, use the group forum and group email to discuss and refine your data structure, your purpose statement, and your rules.

Part B, assignment details

In Part A, you worked in groups to design and create a database structure and write up its rules and standards. You also worked individually to enter ten actual refrigerator objects into your group's agreed-upon data structure, discussing the process and problems while refining your group's final data structure, purpose statement, and rules.  You uploaded your individual records; downloaded, aggregated and uploaded your group's records; and finally uploaded your group's final data structure (no data), purpose statement, and rules. These are all common activities performed daily in libraries all around the world, and the process by which libraries and information centers share records.

In Part B, you will beta test and evaluate the work done by another group.  Part B is designed to give you a sense of the work involved in field testing a datastructure. Your exchange group will deposit their data structure, purpose statement, and rules as a single zip file, as detailed.  You will then be assigned another group's work to evaluate. HOW we will evaluate each other's work is discussed in the "How to do an evaluation" message in our Assignment 1B Blackboard forum. You will individually download and unzip the other group's datastructure, read the rules, and enter ten items from your refrigerator to see how well their datastructure functions and feels, and how well their rules work in guiding you to create records.  You will each then extract/export, zip, and upload your individual files, as detailed above, into your group file exchange area. Then, working as a group and using your group's Blackboard space and group email, you will create a simple document that evaluates the other group's design work.  This assignment helps prepare you for Assignment 3, in which you will work individually to create a datastructure for articles from the course supplemental readings, with a pre- and post-coordinate index.

How to evaluate your exchange group's work: 

Evaluation:

Evaluate the fields, data, and rules; part of the evaluation should be based on how well you did indexing your refrigerator items into their data structure.  Discuss your experience with their design: what did you find clear? What did you have difficulty with? Share any suggestions.  Evaluate their data structure, statement of purpose, and rules on the basis of:

·          their clarity

·          how well they provide for consistent description

·          how well they accommodate exceptions

·          how well they meet the purpose in terms of the intended user group

·          other considerations that occur to you

Working together, create a brief evaluation of the other group's data structure, rules, and statement of purpose based on your own analysis and your group's discussion, and deposit it into the general Blackboard discussion forum, "Assignment 1B Evaluations", using the instructions above.  Read other groups' evaluations and, where appropriate, comment.

Work product checklist:

(a total of 200 points is available for assignment 1)

Each group's file exchange area will contain the following files:

A1. prototype.zip (one for the group) containing your group's collaboratively-designed prototype data structure, user statement and rules [.rtf format]. (50 points for group)

A2. yourlastname_yourfirstname_records.zip (one for each student) containing extracted records for ten items from your personal refrigerator. (25 points per student)

A3. yourlastname_yourfirstname_aggregate.zip (one for each student) containing up to 50 aggregated records from your group.  (25 points per student)

B1. yourlastname_yourfirstname_testother.zip (one for each student) containing records only for ten items entered into exchange group data structure. (25 points per student)

One person from each group will deposit in the "Assignment 1B - Evaluations" Forum:

A4. yourgroupnumber_datastructure.zip containing your group's final data structure and rules, for use by another group for evaluation during Part B.  (25 points for group)

B2.  x_evaluation_y.rtf  (where x is your group number and y is your exchange group number), a group evaluation of the exchange group's data structure and rules. (50 points for group).
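Because points depend on files being named exactly as listed above, it can help to generate the names rather than type them. The Python sketch below is only an illustration: the function names are made up for this example, and forcing names to lowercase is an assumption based on the lowercase examples given in this assignment (such as twining_joanne_records.zip).

```python
import re

INDIVIDUAL_KINDS = ("records", "aggregate", "testother")

# Matches names such as 4_evaluation_5.rtf
EVALUATION_PATTERN = re.compile(r"^\d+_evaluation_\d+\.rtf$")


def individual_filename(last_name, first_name, kind):
    """Build yourlastname_yourfirstname_<kind>.zip (lowercase is an assumption)."""
    if kind not in INDIVIDUAL_KINDS:
        raise ValueError(f"unknown kind: {kind}")
    return f"{last_name}_{first_name}_{kind}.zip".lower()


def datastructure_filename(group_number):
    """Build yourgroupnumber_datastructure.zip for the group's final deposit."""
    return f"{group_number}_datastructure.zip"


def evaluation_filename(group_number, exchange_group_number):
    """Build x_evaluation_y.rtf, where x is your group and y is your exchange group."""
    return f"{group_number}_evaluation_{exchange_group_number}.rtf"


def is_evaluation_filename(name):
    """Check that a file name follows the x_evaluation_y.rtf pattern."""
    return bool(EVALUATION_PATTERN.match(name))
```

The post name for each deposit is the same text without the file extension.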

*A “good” collection:

✓       Has objects that are similar enough that the same set of fields will work for all of them, but different enough that you will have different values for the fields.  (Note that it’s possible that an object won’t have a particular attribute; in this case, your values will need to include a “not applicable” or “none” term.)

✓       Has enough attributes that you can come up with about a dozen fields.

✓       Has some hard fields and some easy fields

o        “Easy” = value can be ascertained easily by the indexer, little ambiguity or room for error

o        “Hard” = Indexer can’t take the value straightforwardly from some written source.  A rule has to be created to ensure that the value is entered correctly.  The challenge here is to you as the database designer to come up with a rule that is so clear that the indexer will be able to do it correctly.  The point isn’t to trick your indexers, but rather to come up with something that's a real challenge to write a good rule for, and to write such a good rule that your indexers will be able to follow it.

o        I’d rather see you try a hard field and fail (and write about why in the Discussion section) than stick only with safe, easy fields (which may not turn out to be as easy as one would think, anyway)
