Find Added To File (Part 2)

Last week I started talking about how to find the "added to file" value for documents in a database. This week I will finish the tip. You should read last week's tip first before starting with this one.

Last time I went over all the code in the Initialize subroutine. So I'll start with the WalkTree subroutine.

Sub WalkTree(db As NotesDatabase, node As NotesDOMNode, xmlHierarchy As Variant, _
doc As NotesDocument)
    Dim child As NotesDOMNode
    Dim elt As NotesDOMNode
    Dim attrs As NotesDOMNamedNodeMap
    Dim a As NotesDOMAttributeNode
    Dim i As Integer
    Dim xmlHierLevel As Integer
    Dim unid As String
    Dim addedToFile As Variant
    Dim numAttributes As Integer
    Dim numChildren As Integer

I define several variables of types that may not be familiar from everyday programming. They are all used to walk through the DOM hierarchy. Remember that we're looking at a tree with branches.

    If Not node.IsNull Then
        Select Case node.NodeType
        Case DOMNODETYPE_DOCUMENT_NODE:
            Set child = node.FirstChild
            Dim numChildNodes As Integer
            numChildNodes = node.NumberOfChildNodes
            While numChildNodes > 0
                ' The sibling is advanced before the recursive call, so the first child is never walked
                Set child = child.NextSibling
                numChildNodes = numChildNodes - 1
                ' Push this node's name so the recursive call knows its path
                xmlHierLevel = Ubound(xmlHierarchy)
                If xmlHierLevel <> 0 Or xmlHierarchy(0) <> "" Then
                    xmlHierLevel = xmlHierLevel+1
                End If
                Redim Preserve xmlHierarchy(xmlHierLevel)
                xmlHierarchy(xmlHierLevel) = node.NodeName
                Call WalkTree(db, child, xmlHierarchy, doc)
                ' Pop the name again on the way back out
                If xmlHierLevel = 0 Then xmlHierarchy(0) = "" _
                Else Redim Preserve xmlHierarchy(xmlHierLevel-1)
            Wend

The first thing the subroutine does is make sure the passed-in node is not null. If it is null, nothing happens and the subroutine exits normally. Assuming the node is not null, I check the type of node. In the Domino Designer Help database, you can see all the different types of nodes possible. My code only deals with three types - the others are ignored. The first type I look at is the document node. The information we want could be somewhere beneath this node, so I want to look at its child nodes.

What I do is keep track of the hierarchy by expanding the xmlHierarchy variable and putting in the name of the node in the last position. This is passed along during the recursion so the recursive call knows "where it came from". The recursion happens on every child node for the document node.
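
The push and pop steps are the same every time, so here is a minimal sketch of them pulled out into two helpers. PushLevel and PopLevel are hypothetical names used only for illustration. They are not part of the agent, which keeps these steps inline.

' Hypothetical helpers (not in the original agent)
Sub PushLevel(xmlHierarchy As Variant, nodeName As String)
    Dim level As Integer
    level = Ubound(xmlHierarchy)
    ' An array holding one empty string means "no path yet", so slot 0 is reused
    If level <> 0 Or xmlHierarchy(0) <> "" Then level = level + 1
    Redim Preserve xmlHierarchy(level)
    xmlHierarchy(level) = nodeName
End Sub

Sub PopLevel(xmlHierarchy As Variant)
    Dim level As Integer
    level = Ubound(xmlHierarchy)
    ' Shrink by one entry, or go back to the single empty string at the top
    If level = 0 Then
        xmlHierarchy(0) = ""
    Else
        Redim Preserve xmlHierarchy(level - 1)
    End If
End Sub

Starting from an array that holds one empty string, pushing "#document" and then "database" leaves Join(xmlHierarchy, "~") equal to "#document~database".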

How did I know what to look for? I started out with the example from the Domino Designer Help database and used it to print everything out to a text file. I was then able to look through the file, find what I was after, and trace the hierarchy back to see which nodes I needed and which I could ignore.
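
If you want to do the same kind of exploring, something like the following might be enough. It is only a rough sketch: DumpNode, the fileNum parameter and the file path are my assumptions for illustration, not code from the Help example or from this agent.

' Hypothetical debugging aid (not in the original agent).
' fileNum is a file the caller has already opened, for example:
'   fileNum = Freefile()
'   Open "c:\temp\dxlpaths.txt" For Output As fileNum   ' path is an assumption
'   ... walk the tree, calling DumpNode for each node ...
'   Close fileNum
Sub DumpNode(fileNum As Integer, node As NotesDOMNode, xmlHierarchy As Variant)
    ' One line per node: the path of its parents, its own name, and its value
    Print #fileNum, Join(xmlHierarchy, "~") & " > " & node.NodeName & " = " & node.NodeValue
End Sub

Calling this at the top of WalkTree (inside the null check) should list every path, and the one ending in addedtofile~datetime is the one to look for.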

        Case DOMNODETYPE_TEXT_NODE:
            If Not doc Is Nothing Then
                If Join(xmlHierarchy, "~") = "#document~database~document~noteinfo~addedtofile~datetime" Then
                    addedToFile = ConvertDateTime(node.NodeValue)
                    Call doc.ReplaceItemValue("AddedToFile", addedToFile)
                    Call doc.Save(True, False, True)
                    ' Clear doc so no later text node is stored in this document
                    Set doc = Nothing
                End If
            End If

The next type of node to look at is a plain text node. This is the type of node that contains the "Added To File" value we want, but text nodes also appear under many other elements. So the code looks at the hierarchy. The path has to start at the DOM document, go to the database, then to a document (which will always happen in this example because our initial collection only has documents), then to the note information (i.e. the document properties), then to the "added to file" property, and lastly to the date/time value of that property. Only then is this the value we want to save. See how keeping track of the hierarchy helps out? The text node case is reached many times, but there's only one specific path where this code cares about it.

The value of the node is a standards-based date time format ("Coordinated Universal Time" or "UTC"). That needs to be converted to something Notes-specific. The ConvertUniversalDateTime function (discussed here) will do that. Note that the code above calls it as ConvertDateTime, so use whichever name your copy of the function has. The converted value is stored in the document object that has been passed around during the recursion. You haven't yet seen the code where that object is set - that will happen soon. The document object is cleared after the value has been stored in the document.
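
To give a feel for what that conversion involves, here is a minimal sketch. It is not the function from the other tip. It assumes the text looks like 20050412T093015,25-04 (date, a "T", time, then a fraction and a zone offset), and it simply ignores the fraction and the zone offset. SketchConvertDxlDateTime is a hypothetical name.

' Hypothetical sketch (not the function from the other tip).
' Assumed input shape: yyyymmddThhmmss, optionally followed by a fraction and zone offset.
Function SketchConvertDxlDateTime(dxlValue As String) As Variant
    Dim yr As Integer, mo As Integer, dy As Integer
    Dim hr As Integer, mi As Integer, sc As Integer
    yr = Cint(Mid$(dxlValue, 1, 4))
    mo = Cint(Mid$(dxlValue, 5, 2))
    dy = Cint(Mid$(dxlValue, 7, 2))
    ' Position 9 is the "T" separator
    hr = Cint(Mid$(dxlValue, 10, 2))
    mi = Cint(Mid$(dxlValue, 12, 2))
    sc = Cint(Mid$(dxlValue, 14, 2))
    ' Combine the date and time parts into one date/time value.
    ' The fraction and the zone offset are ignored in this sketch.
    SketchConvertDxlDateTime = Cdat(Cdbl(Datenumber(yr, mo, dy)) + Cdbl(Timenumber(hr, mi, sc)))
End Function

A real conversion would also have to deal with the zone offset, which is why the agent relies on the separate function.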

        Case DOMNODETYPE_ELEMENT_NODE:
            Set elt = node
            Set attrs = elt.Attributes
            numAttributes = attrs.NumberOfEntries
            For i = 1 To numAttributes
                Set a = attrs.GetItem(i)
                If a.NodeName = "unid" Then
                    If Join(xmlHierarchy, "~") = "#document~database~document" Then
                        If node.NodeName = "noteinfo" Then
                            unid = a.NodeValue
                            On Error Resume Next
                            Set doc = db.GetDocumentByUNID(unid)
                            ' Check Err while Resume Next is still active, before the handler is reset
                            If Err <> 0 Then
                                Err = 0
                                Set doc = Nothing
                            End If
                            On Error Goto 0
                        End If
                    End If
                End If
            Next

Most nodes are element nodes, so this part of the Select statement will be called a lot. Inside here, we care about the UNID attribute. When that is found, the first thing I do is ensure that the right path was used to get to this UNID attribute - that should be the DOM document node (#document), then the database, then a document. Because of how the Initialize subroutine was set up, this check could actually be skipped, but it's a good idea to include it. This attribute should be on the "noteinfo" node, so that check is made. Assuming the right path was used to get here, I set the doc variable to that document (the one for this UNID). If the document can't be found, doc is set to Nothing so that nothing gets saved. That document is passed through the recursion until the "added to file" value is found (the code block above) and set in the document.

            numChildren = elt.NumberOfChildNodes
            Set child = elt.FirstChild
            While numChildren > 0
                xmlHierLevel = Ubound(xmlHierarchy)
                If xmlHierLevel <> 0 Or xmlHierarchy(0) <> "" Then
                    xmlHierLevel = xmlHierLevel+1
                End If
                Redim Preserve xmlHierarchy(xmlHierLevel)
                xmlHierarchy(xmlHierLevel) = node.NodeName
                Call WalkTree(db, child, xmlHierarchy, doc)
                If xmlHierLevel = 0 Then xmlHierarchy(0) = "" _
                Else Redim Preserve xmlHierarchy(xmlHierLevel-1)
                Set child = child.NextSibling
                numChildren = numChildren - 1
            Wend

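Element nodes are the most common type and they can have children of their own, so those children need to be processed. This starts the recursion in the same manner as was discussed earlier with the document node. The only difference is that the move to the next sibling happens after the recursive call, so every child is visited.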

        End Select
    End If
End Sub

That ends the WalkTree subroutine. The End Select statement closes the choice of node type, the End If statement closes the check for a null node, and End Sub closes the subroutine itself.

So, that's everything. You should now be able to run this agent against a certain database and every document will be updated with a field called AddedToFile with the date and time when the individual document was added to the current file. Obviously, if you're running this on multiple replicas you'll get different answers and could cause replication conflicts. So it might be better to simply report the information instead of actually updating the documents. It would be easy to modify this agent to write all the results to an agent log.
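
As a rough idea of what that change could look like, here is a sketch that uses the NotesLog class. ReportAddedToFile, the agentLog variable and the log name are hypothetical. You would also need to get the log object into WalkTree somehow, for example as an extra parameter or a global variable.

' Hypothetical reporting helper (not in the original agent).
' Call it in the text node case in place of ReplaceItemValue and Save.
Sub ReportAddedToFile(agentLog As NotesLog, doc As NotesDocument, addedToFile As Variant)
    ' Write one line per document instead of changing the document
    Call agentLog.LogAction(doc.UniversalID & ": added to file " & Cstr(addedToFile))
End Sub

' In Initialize (the names here are assumptions):
'   Dim agentLog As New NotesLog("Find Added To File")
'   Call agentLog.OpenAgentLog
'   ... export the documents and call WalkTree as before ...
'   Call agentLog.Close

Since nothing is saved, running this against several replicas can't cause replication conflicts.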

Breaking Par Consulting
