Friday, March 28, 2008

Documentum System Administration Training

I was pleased to wrap up my week of training with 2 days of Documentum System Administration training. This is the basic training. As you'll notice in my notes, there are some things that are only covered in the advanced class. Such is life.

This post will cover both days. I found myself taking fewer notes in this class as it was more interactive with fewer students. And the labs were more interesting. Also, the slides pretty much cover the details you need to know. As with Livelink administration, I noticed this is something you get trained for. But, you probably learn a lot more by doing and troubleshooting over time.

Again, these are mostly just raw notes. Not intended as entertaining literature :-) One thing about these notes is that this class covered Documentum version 6. There are a lot of comparisons to Livelink here as that's the world I come from and understand as it pertains to ECM. Of course, I've also tinkered with Alfresco but nothing serious (yet).

Day 1 of Administrator Training.

Add-on services. Collaboration Services does chat, forums, portal, etc. Do we have this? It turns out we do. After asking this question, I found out there are quite a few differences between eRoom and Collaboration Services. Not something I will go into here. But, suffice it to say that we will most likely be focusing on the latter.

LDAP integration. I noticed that Notes LDAP is not listed in the supported list of LDAP servers. Yes, we use Notes where I work. I know. I know.

During our lab, our Reassign user job wouldn't run today. We fixed it with this DQL query:

Then restart your Docbase.

Language packs are very easy to do. I can't believe we spent so much time worrying about this in the evaluation process.

Jobs - the Window interval is worth noting. Setting it to 60 means the job can ONLY run within 60 minutes either side of its scheduled time. If you want to enable a job to be able to run anytime, the number is weird. It should be 720 (plus or minus 720 minutes covers the whole day), but it's not. It's 1440. I don't care what they say. That is an odd implementation of a job scheduler.
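To convince myself the math was right, here's a little sketch of that window logic. This is my own Python, not anything from Documentum, and it assumes the window really does mean "within N minutes either side of the scheduled time" as described in class. It shows why 720 would already cover the whole day:

```python
def in_window(scheduled_min: int, now_min: int, window_min: int) -> bool:
    """True if now_min is within window_min minutes of scheduled_min.
    Times are minutes-of-day (0..1439), with wraparound at midnight."""
    diff = abs(now_min - scheduled_min)
    diff = min(diff, 1440 - diff)  # shortest distance around the clock
    return diff <= window_min

# A noon-scheduled job with a 720-minute window can run at ANY minute of
# the day, since nothing is ever more than 720 minutes away on the clock.
print(all(in_window(720, m, 720) for m in range(1440)))  # True

# A 60-minute window really is restrictive: 100 minutes late is too late.
print(in_window(600, 700, 60))  # False
```

So by this reading, 720 should have been the "run anytime" value, which is why 1440 struck me as odd.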

Day 2 of Administrator Training. Happy Friday.

We discussed distributed repositories. See distributed repositories reference. We didn't go too deep into content replication as that's in the advanced administrator course.

BOCS (Branch Office Caching Services) is like a standalone ACS server. It's essentially a cache server. Not replicating (except maybe an initial pre-population). It's lightweight. It's not a full Content Server (you can't run jobs and stuff). Easy to administer, etc. It might be a good solution for remote sites. Requires web-based Documentum clients.

ACS/BOCS is read-only by default, but you can make it writable. It's configurable.

Regarding failover, Documentum doesn't need to know about it. There's only one Content Server. With repository configurations, you have to tell each connection broker about the other servers using proximity values. If a server's proximity value has a 0 in the thousands position, it's used for data requests. If it has a 9 there, it's used for content requests. And, there could be multiple of each. Requests go to the closest one by number.

The proximity values use a four-digit number: ####. But, you need to split that out as the first digit and then the remaining three. For example, 0100 is 0 and 100. And, 9001 is 9 and 001. So those would be a server for data requests with a proximity of 100 and a server for content requests with a proximity of 1, respectively.
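Here's a quick sketch (my own Python, with a made-up helper name) of how those values split apart, going by the encoding described in class:

```python
def parse_proximity(value: str):
    """Split a 4-digit connection-broker proximity value into its
    thousands digit (which kind of request) and 3-digit proximity."""
    kind = int(value[0])        # 0 = data requests, 9 = content requests
    proximity = int(value[1:])  # lower numbers are "closer" and win
    role = {0: "data", 9: "content"}.get(kind, "unknown")
    return role, proximity

print(parse_proximity("0100"))  # ('data', 100)
print(parse_proximity("9001"))  # ('content', 1)
```

So 0100 and 9001 aren't really comparable as whole numbers; only the last three digits compete against each other within each request type.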

Failover support - in this class, we are primarily talking about Content Server failover support, not application server failover support, which is different.

Multiple repositories (docbases). You usually don't want just one, as different groups within your organization may want their own. You can use object replication. Not content replication like in the remote location/caching scenarios. Instead, you are replicating objects. It's important to keep track of which one is the original. If you try to check out a replica, it connects to the original and checks that one out; then you can edit the replica. If someone does something on the original server, it doesn't update replicas immediately. It gets scheduled to replicate. See object replication jobs.

Interesting you can also configure it in such a way that you replicate objects manually. You could export it from one repository and burn to a CD. Example: movie studio, big big files. You don't want to push this stuff across the network.

When handling multiple repositories, you need a repository federation in order to keep users, groups, and permissions consistent across repositories, since each repository has its own permissions and its own set of users.

One repository must be named as the governing repository. Superuser priv's not replicated. That's good. Setting up a federation is pretty easy. You create an object and define the governing repository and a job to do the replication.

For object replication, each job represents only a single cabinet or folder.

Storage areas are neat. You can actually specify whether or not to leave the extensions on. In Livelink this is not the case. Extensions are not included in LL.

Interesting choices on storage areas. I hope nobody asks for anything crazy. Blobs are always a bad idea. However, the Content Addressing option (forces you to use one of their Centera devices) could be an option for some types of content, especially email. The reason is we need to archive that stuff. And, it's read-only anyway. So, there's value there.

Records migration jobs. Interesting way to maintain your storage areas. I suppose you'd do this if disk space was more critical. I never had to do anything like this with Livelink, b/c I made sure I always had ample disk space. Keep it simple.

dm_ContentWarning - tells you if the drive is getting full. Set up a job to do this.
dm_DMClean - cleans up orphaned files, aborted workflows, etc. Like an empty-trash function.
dm_DMFilescan - looks at the filesystem and then looks in the repository. That's the reverse direction. This is more useful in a distributed environment where you are using content replication. So, you need both in a complex architecture.
See also dm_RenditionMgt, dm_VersionMgt, and dm_RemoveExpiredRetnObjects.

It's amazing how transparent they are about their system. I don't remember OpenText being this open about how everything works. Often it seemed in LL they had options in the UI that they told you never to touch without talking to support. The Documentum folks seem to take a different approach. They WANT you to know what everything does as much as possible. Of course, they don't cover everything in this course. That's why they have an advanced admin course. Who knows. Maybe the OpenText culture has changed a little in the last 2 years?

Logging. Well, the logs aren't all in one place. There are lots of places to look. Reminiscent of my experience administering Oracle AS. That's what happens when you buy other software and try to merge it with your own. :-) But, it looks pretty reasonable. They are using Log4J. w00t. Wow, you can filter DFC tracing to look at only a single user.

Consistency Checker. This is great. Livelink has something like this as well. Every system should have this!

Search. They used to use Verity before v5.3. Didn't scale as well. Now they use an Index Agent/Index Server package. Here we go. Index administration. More and more with Livelink administration, this became a large part of your responsibility. Search must ALWAYS be working and there are a lot of configuration options. It will be interesting to see how difficult this is to manage with Documentum.

Full text software has to be installed separately. Waa? Who wouldn't want their data full text indexed these days? Whatever. Must be for the small group option.

Wow. This Indexer reserves the next 4000 ports after the one you assign. Beastly. No wonder they insist you run it on a separate box from the Content Server.

When you run the installer for the indexer, one of the screens displays prerequisite information. And one of those items is that they do NOT support VM's for this. Thank GOD. I've had more troubles with VM's for certain software where I work than you might imagine. So, it will be nice to have that screenshot to point to. The thing is that VM's are not good enough to handle disk-intensive systems like Index Servers.

Interesting architecture note: each Index Server uses a self-contained copy of Tomcat because it might be installed on a separate box.

By default, everything (dm_sysobject) is indexed. If you want to selectively manage that, you turn it off and explicitly name the types to index. Livelink was like the latter. It will be interesting to see if there are any problem document types. Every now and again in Livelink, some documents would cause trouble. In fact, there were sometimes mystery documents that simply would not index. In which case, you could exclude them by node ID. Subtypes use their parent's indexing property: if the parent type is indexed, so are the child types.
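As I understand the inheritance rule, it works something like this toy sketch (my own Python; "invoice" is a made-up custom type, and the fallback-to-parent logic is just my reading of what was said in class):

```python
# Toy model: a type with no explicit indexing setting (None)
# falls back to its parent's setting, all the way up the chain.
types = {
    "dm_sysobject": {"parent": None, "indexed": True},
    "dm_document":  {"parent": "dm_sysobject", "indexed": None},
    "invoice":      {"parent": "dm_document", "indexed": None},
}

def is_indexed(name: str) -> bool:
    """Walk up the type hierarchy until an explicit setting is found."""
    t = types[name]
    if t["indexed"] is not None:
        return t["indexed"]
    return is_indexed(t["parent"])

print(is_indexed("invoice"))  # True, inherited from dm_sysobject
```

So turning indexing off at the top and naming types explicitly would flip the default for every subtype that doesn't say otherwise.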

There's also a self-paced 1 day course on index administration if we want to learn a little more about search. They are not free, but they are cheaper. "Online" means an online class like this one. "Self-paced" is different.

Upgrading
Run the Consistency Checker prior to upgrade. Man, that is right out of the Livelink book. Do I know that's true :-)

References to have

  • Server Fundamentals Guide (the bible)

  • Server Administration Guide



The Advanced class gets into performance tuning.
There is a System Administration certification. It's only a couple hundred bucks. You can take it multiple times if you need to, but you have to pay each time. The WDK course is a 2-day course and might also be useful.

Overall. My training was quite good. The trainer gets good marks. I took these classes from MicroTek down in the Financial District.

That's it for my training. But, word is I may get some more soon. DFC. Look out.
