2.1.14 DAV Searching and Locating (dasl) bof

Current Meeting Report

DASL BOF Meeting Minutes
42nd IETF, Chicago
August 25, 1998

Reported by Dale Lowry, edited by Jim Davis.

The meeting began with an introduction by Alex Hopman.

Alex reviewed the charter, which is unchanged since the LA IETF. There was
consensus in the room that it was appropriate. Next he reviewed the DASL schedule.
DASL is on schedule. The next milestones are:

- Oct 1998: Create final version of DASL requirements document. Submit as
informational RFC.
- Dec 1998: Meet at Orlando IETF to refine protocol document.
- March 1999: Complete revisions to DASL.
- The METAD group is doing similar work that we want to leverage.

Next, Jim Davis presented scenarios that motivate the uses of DASL, and the
functional requirements that follow from these scenarios:

Scenarios

1) Property search

Example: find all image/jpeg resources created in May, 1998 by Ansel Adams. This
query tests two DAV properties and one user-defined one.

DAV-defined properties to search include content type, resource type, date created or
modified, and language.

Question to the audience: How important is it to be able to search for structured value
properties?

Reaction: Some people would like this functionality.

It might be useful for some DAV-defined properties (e.g. lockdiscovery), or when
querying against user-defined properties that have list (or set) values, e.g. one might
want to locate documents written by both "Michael Jordan" and "Dennis Rodman".
Note that the object model of DAV is *not* XML; XML is only one possible wire
transport for the model. Thus this example shows only how the property might look
on the wire, and says nothing about how it might be stored.

<author>
<person>Jordan, Michael</person>
<person>Rodman, Dennis</person>
<person>Pippen, Scotty</person>
</author>
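
As a rough illustration of the query discussed above, the following Python sketch parses
the wire form of the author property and tests whether all of the named people appear in
it. The function name has_all_authors and the name "Someone, Else" are hypothetical, and
the sketch says nothing about how a DASL server would store or evaluate such a property.

```python
import xml.etree.ElementTree as ET

# Wire form of the property, copied from the example above.
AUTHOR_XML = """
<author>
<person>Jordan, Michael</person>
<person>Rodman, Dennis</person>
<person>Pippen, Scotty</person>
</author>
"""

def has_all_authors(property_xml, wanted):
    """Return True if every name in `wanted` is a <person> value of the property."""
    root = ET.fromstring(property_xml.strip())
    people = {(p.text or "").strip() for p in root.findall("person")}
    return set(wanted) <= people

# Both requested authors are present in the set-valued property.
print(has_all_authors(AUTHOR_XML, ["Jordan, Michael", "Rodman, Dennis"]))
# A name that is not in the property makes the test fail.
print(has_all_authors(AUTHOR_XML, ["Jordan, Michael", "Someone, Else"]))
```

Treating the values as a set is an assumption made here for simplicity; a list-valued
property where order matters would need a different comparison.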

2) Content search

Example: Locate all texts containing "efficient car"

In such searches, hit highlighting is often useful. Hit highlighting displays those
segments of the resource that contain the hit, thus allowing one to distinguish between e.g.

"... Ford's new fuel-efficient car..." and
"... to make Lisp efficient, car is implemented by a single instruction..."
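
A minimal sketch of hit highlighting in Python follows. It assumes a tokenization in
which the query words may be separated by any run of punctuation or whitespace, which
is only one of many possible content-search semantics; the function name highlight_hits
and the bracket markers are hypothetical.

```python
import re

def highlight_hits(text, phrase, context=20):
    """Return snippets of `text` around each case-insensitive hit for `phrase`.

    Assumption: words of the phrase may be separated by any non-word characters,
    so "efficient car" also matches "efficient, car".
    """
    pattern = r"\W+".join(re.escape(word) for word in phrase.split())
    snippets = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        before = text[max(0, m.start() - context):m.start()]
        after = text[m.end():m.end() + context]
        # Mark the hit with brackets and show some surrounding context.
        snippets.append("..." + before + "[" + m.group(0) + "]" + after + "...")
    return snippets

for doc in [
    "Ford's new fuel-efficient car",
    "to make Lisp efficient, car is implemented by a single instruction",
]:
    print(highlight_hits(doc, "efficient car"))
```

Both example texts match under this tokenization, and the surrounding context in each
snippet is what lets a reader tell the two hits apart.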

Issue: Unlike, e.g. property search, there is no single widely accepted model for the
semantics of content search. Search engines differ *widely* in their semantics, e.g. in
what operators they support, how they tokenize input and treat punctuation, etc.

Issue: for search of HTML documents, it would be desirable for the search engine to
understand the HTML structure, e.g. to search for text only within certain tags.

3) Site Navigation

Examples:
- Locate resources locked more than seven days
- Locate resources unchanged this year

Suggestion: find resources with certain security constraints currently set.

[Editor's note: DASL will consider queries on ACLs after DAV defines ACLs, but not before.]

4) Search Options
- size and time limits on results
- partial results (when server finds too many)
- depth in scope
- ordering results
- paged results
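
The interaction between a size limit, partial results, and paging can be sketched as
follows. This is an illustration in Python only, not DASL protocol syntax; the function
page_results and the keys of the dictionary it returns are hypothetical.

```python
def page_results(matches, page_size, page=0, max_results=None):
    """Return one page of `matches`, plus a flag saying whether the server
    cut the overall result set short (a "partial result")."""
    truncated = max_results is not None and len(matches) > max_results
    limited = matches[:max_results] if max_results is not None else matches
    start = page * page_size
    return {
        "records": limited[start:start + page_size],
        "truncated": truncated,
        "total_available": len(limited),
    }

matches = list(range(10))          # stand-in for ten matching resources
print(page_results(matches, page_size=3, page=1, max_results=7))
```

Here the client asks for the second page of three records while the server caps the
result set at seven, so the response carries records 3 to 5 and reports truncation.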

Requirements
Requirements use the following terminology:
- result set: one record for each resource that matches criteria
- result record: set of properties for each resource in the result
- scope: set of resources to be searched
- criteria: determines membership in result set
- result record definition: set of properties in the result record
- sort specification: order of records
- search attributes: other instructions, e.g., max size of result
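
To show how these terms relate, here is a small in-memory sketch in Python. It is not
the DASL protocol: the Query class, the run function, and the property names ("name",
"content_type", "author") are all hypothetical, and resources are modelled simply as
dictionaries of properties.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Resource = Dict[str, object]  # property name -> value

@dataclass
class Query:
    scope: List[Resource]                     # set of resources to be searched
    criteria: Callable[[Resource], bool]      # determines membership in result set
    result_record_definition: List[str]       # properties in each result record
    sort_specification: Optional[str] = None  # property used to order records
    max_results: Optional[int] = None         # a search attribute

def run(query):
    """Evaluate the query and return the result set as a list of result records."""
    hits = [r for r in query.scope if query.criteria(r)]
    if query.sort_specification is not None:
        # Assumes every hit defines the sort property.
        hits.sort(key=lambda r: r[query.sort_specification])
    if query.max_results is not None:
        hits = hits[:query.max_results]
    return [{name: r.get(name) for name in query.result_record_definition}
            for r in hits]

resources = [
    {"name": "b.jpg", "content_type": "image/jpeg", "author": "Ansel Adams"},
    {"name": "a.jpg", "content_type": "image/jpeg", "author": "Ansel Adams"},
    {"name": "c.txt", "content_type": "text/plain", "author": "Ansel Adams"},
]
query = Query(
    scope=resources,
    criteria=lambda r: r["content_type"] == "image/jpeg"
    and r["author"] == "Ansel Adams",
    result_record_definition=["name"],
    sort_specification="name",
    max_results=10,
)
print(run(query))
```

Each dictionary in the returned list is one result record, and the list as a whole is
the result set.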

[Editor's note: The list below shows the requirements as presented, with annotations
and additions made during the meeting. Note that we intend to soon publish a revised
internet draft with a different numbering scheme.]

All requirements are to be considered to be prefaced with either "It must be possible
to ..." or "It is desirable that it be possible to ...".

- S: Scope
- S1: specify a number of distinct, unrelated resources in the scope
- S2: specify a collection as scope
- S2.1: specify the depth of a scope
- S3: specify a scope within a resource
- Issue: Should we specify a scope smaller than a single resource? No one seemed
to care.

- C: Criteria
- C1: search both properties and content in single query
- C2: combine criteria with boolean operators
- C3: support undefined properties

Question: Will we have derived properties, e.g. count, sum, average? For example, find
documents with two or more authors.

Answer: anything you can PROPFIND for you can search for in DASL. This is up to
the server to support. This doesn't always mean that the property is stored. DASL
won't explicitly define inheritance, aggregation, etc...

- C4: compare property values to
- C4.1: constant values
- C4.2: other properties

Question: do you mean properties of the same resource or of other resources? The
latter might require a join.

- C4.3: expressions
- C4.4: regular expression

- C5: operators
- C5.1: equality
- C5.2: with relative operators

- C6: existence
- C7: specify case-sensitivity
- C8: specify national sorting order

Comment: It should be possible to specify the relative performance of different search
mechanisms. There may be cases where there are performance issues, e.g. a client
might want to know whether a given property is indexed or not. [see below]

- C9: search content of any text media type
- C10: specify searches using
- C10.1: like
- C10.2: contains
- C10.3: near
- C10.4: in

Question: Regular expressions?

Question: Is it a requirement that a given implementation against the same data return
the same results on the exact same query? This isn't a requirement yet and may not
be possible.

- C11: specify
- stemming
- phonetic
- keyword
- case sensitivity

- R: Results
- R1: specify the maximum size of results set
- R2: set of properties in result
- R3: sort order

- O: Other
- O1: extensibility
- multiple grammars
- optional operators
- O2: request paged results
- O3: redirected query
- O4: hit highlighting

Comment: Client may want to prevent server from redirecting, for privacy/security
reasons.

- D: Discovery
- D1: discover grammars
- D2: discover operators
- D3: discover scope information -- searchable properties, indexing

Question: Do we want to discover redirection capabilities?

Finally, Saveen Reddy briefly presented an overview of the protocol, showing some
example queries. In the short time available, he did not attempt to cover the full
protocol.

Question to attendees: Should we have a working group for DASL? There was a rough
consensus that this was appropriate.

Slides

DAV Simplesearch Grammar
