05-04-2007 12:17 PM
It's been almost 75 mins since the last update in the restore log. Is there something going wrong with this restore (from one LMS 2.6 server to another)?
tail -f /var/adm/CSCOpx/log/migration.log
04/May/2007 15:01:29:695 208911 [main] INFO com.cisco.nm.rmeng.migration.MigrMain 135 - Unmapping 717@nms of RME from PIDM
04/May/2007 15:01:29:700 208916 [main] INFO com.cisco.nm.rmeng.migration.MigrMain 160 - UpdatePIDMEntries are done
04/May/2007 15:01:29:701 208917 [main] INFO com.cisco.nm.rmeng.migration.MigrMain 161 - UpdatePIDMEntries are done got to
04/May/2007 15:01:29:701 208917 [main] INFO com.cisco.nm.rmeng.migration.MigrMain 70 - Exiting now... from do mig
04/May/2007 15:01:29:701 208917 [main] INFO com.cisco.nm.rmeng.migration.MigrMain 71 - Exiting doMigration
04/May/2007 15:01:29:702 208918 [main] DEBUG com.cisco.nm.rmeng.migration.MigrMain ? - Calling postMigration() of class: com.cisco.nm.rmeng.migration.DcmaDataMigrator@1ff7a1e
04/May/2007 15:01:29:702 208918 [main] DEBUG com.cisco.nm.rmeng.migration.MigrMain ? - Calling postMigration() of class: com.cisco.nm.rmeng.sa.migration.SaDataMigrator@d19bc8
04/May/2007 15:01:29:703 208919 [main] DEBUG com.cisco.nm.rmeng.migration.MigrMain 291 - SaDataMigrator::postMigration() - No post migration to be done.
04/May/2007 15:01:29:703 208919 [main] DEBUG com.cisco.nm.rmeng.migration.MigrMain ? - Calling postMigration() of class: com.cisco.nm.rmeng.migration.SWIMDataMigrator@1630ab9
04/May/2007 15:01:29:704 208920 [main] DEBUG com.cisco.nm.rmeng.migration.MigrMain ? - Calling postMigration() of class: com.cisco.nm.rmeng.migration.RMEDataHandler@1551f60
Solved! Go to Solution.
05-10-2007 12:20 PM
Ah-ha! Found it. NMSROOT is reset by the fact that restorebackup.pl uses Perl. Instead of being the non-canonical /opt/CSCOpx, it is /product/CSCO/CSCOpx. Since dmgtd.sol always runs as /opt/CSCOpx/objects/dmgt/dmgtd.sol, the dmgtd script can't find it.
If you modify /etc/init.d/dmgtd and change line 30 to:
if echo ${i} | /usr/bin/cut -d, -f2 | /usr/bin/egrep "^/opt/CSCOpx/objects/dmgt/${DMGRFILE}$" >/dev/null 2>&1 ; then
This problem will go away. I don't like using hardcoded values like, but I think it's safe in this particular case. Sorry for the mess.
05-04-2007 12:26 PM
This is the migration.log. Check the restore log to see if there's anything happening in there. At this point, I wouldn't be too worried. The migration will take time (could be multiple hours). The major bugs with migration taking days should all be fixed in LMS 2.6, and I see no indication of problems in this log snippet.
05-04-2007 01:40 PM
tail -f restorebackup.log
[Fri May 4 14:57:57 2007] Value of PX_DATA_DIR is: /var/adm/CSCOpx/files
[Fri May 4 14:57:57 2007] CMF Version is : 3.0.5
[Fri May 4 14:57:57 2007] No need to update the incremental ccr
[Fri May 4 14:57:57 2007] No need to update the ccr
[Fri May 4 14:57:57 2007] This is Restore
[Fri May 4 14:57:57 2007] Restoration of the RME data is in progress...
[Fri May 4 14:57:57 2007]
[Fri May 4 14:57:57 2007] See the log file /var/adm/CSCOpx/log/migration.log for progress...
Daemon Management is not running.
INFO : Stopping DBEngine processes registered to Daemon Manager
Just curious: Does LMS 2.5+ implement more/better compression for the backup archive? My LMS 2.2 backup used to eat up >7GB, while LMS 2.6 now only consumes ~1.9GB, though the latter does need >6GB temp space to perform the restore.
05-04-2007 02:10 PM
Nothing is different in terms of compression. After a migration, the resulting data may actually be smaller since the RME database is not restored as-is, but rather the data is transfered into the new database.
05-05-2007 07:35 AM
It started at 14:14 yesterday and is still not finished:
ps -ef | grep restore
casuser 4743 4382 0 11:29:53 pts/2 0:00 grep restore
root 23562 384 0 14:14:03 pts/9 0:07 /opt/CSCOpx/objects/perl/bin/perl /opt/CSCOpx/bin/restorebackup.pl -d /product/
-rw-r--r-- 1 root other 6102085 May 4 15:01 migration.log
-rw-r--r-- 1 casuser casusers 1560530 May 4 15:01 restorebackup.log
-rw-r----- 1 casuser casusers 36062 May 4 15:07 CampusOGSServer.log
-rw-rw---- 1 casuser casusers 1730241 May 4 15:07 pidm.log
-rw-rw---- 1 casuser casusers 444056 May 4 15:08 license.log
-rw-r----- 1 casuser casusers 3002 May 4 15:33 RMEDbMonitor.log
-rw-r----- 1 casuser casusers 3002 May 4 15:33 CmfDbMonitor.log
-rw-rw---- 1 casuser casusers 34329 May 4 15:33 CTMJrmServer.log
-rw-rw---- 1 casuser casusers 402895 May 4 15:34 NCTemplateMgr.log
-rw-rw---- 1 root casusers 30787 May 4 15:34 rme_base.log
-rw-rw---- 1 root casusers 8021736 May 4 15:34 rme_ctm.log
-rw-rw---- 1 casuser casusers 5438912 May 4 15:34 IC_Server.log
-rw-rw---- 1 casuser casusers 74597 May 4 15:34 daemons.log
-rw-rw---- 1 casuser casusers 161318 May 4 15:34 RMECSTMServerlog
-rw-rw---- 1 casuser casusers 1650290 May 4 15:34 dmgtd.log
-rw-rw---- 1 casuser casusers 1839918 May 5 03:04 dbbackup.log
-rw-rw---- 1 casuser casusers 11458 May 5 11:27 diskWatcher.log
-rw-rw---- 1 casuser casusers 9197825 May 5 11:28 dmgtDbg.log
Two points of curiosity: 1) The nightly backup ran as scheduled at 3:05 this morning. The archive comes out to 280MB, compared to ~1.9GB on the other server. I don't know it backed up what's originally on this server, or the other one's db which I was trying to restore here. 2) dmgtDbg.log has gotten a lot more chatty ever since 15:01 yesterday, logging every minute, right after migration.log stoped updating:
[Fri May 04 14:58:40 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_DBG_LEVEL)=NULL
[Fri May 04 14:58:40 EDT 2007][dmgrDbg]##### INFO ##### re-evaluate DbgLevel=0x0
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_DBG)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_MY_DEBUG)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_MY_TRACE)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_DBG_LEVEL)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg]##### INFO ##### re-evaluate DbgLevel=0x0
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_DBG)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_MY_DEBUG)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_MY_TRACE)=NULL
[Fri May 04 15:01:24 EDT 2007][dmgrDbg][dmgrDbg] getenv(PX_DBG_LEVEL)=NULL
05-05-2007 10:06 AM
Your migration may be failing due to the symlinks you said you were using previously. Doing a backup of this partially migrated data will not yield any usable results, so I wouldn't trust that backup. The full migration.log will be needed to see if using symlinks is the problem.
05-05-2007 10:14 AM
I'd also like to know if dmgtd is running at this point? The restore log says it is not, but it looks like the script may be stuck in a loop trying to shutdown the database servers. If dmgtd is still running, it may be trying to respawn these processes.
Please provide the full output of ps -efl.
05-08-2007 07:32 AM
Could you take a look at TAC case 605960929 that I just opened for this issue? Thanks.
05-08-2007 07:55 AM
I assume this is still running now, and this ps output is current. What is the cksum on /etc/init.d/dmgtd? I want to make sure you have the correct version of my patch. It looks like the dmgtd script is not seeing the dmgtd.sol process. Also, if you run /etc/init.d/dmgtd stop now, does it also report that dmgtd is not running?
05-08-2007 08:01 AM
cksum dmgtd
333832757 23182 dmgtd
/etc/init.d/dmgtd stop
Daemon Management stopping. This may take a few minutes.
BTW, the only symlink in use is NMSROOT (/opt/CSCOpx/) pointing to /product/CSCO/CSCOpx/, but I think that's supported.
05-08-2007 08:05 AM
That's the right patch, and that's what you should see when dmgtd goes down. With dmgtd down, can you try the restore again? If it hangs again, I'd like to see the new ps output, and we may need to get a truss.
05-08-2007 08:12 AM
Actually the restorebackup.pl script did prompt me to perform "/etc/init.d/dmgtd stop" before it could start, so I suppose it'd probably just hang again. I'll go ahead give it another try and see what happens.
05-08-2007 03:11 PM
Looks like this second restore is going down the exact same path as the first one. I should note that the restore target server has two LMS patches the restore source does not: the patched dmgtd for CSCsi74887 and the fix for CSCsg26840 user must be Network Admin to use CiscoView. Would these discrepancies cause this restore failure?
ps -ef | grep restore
casuser 18973 16548 0 19:03:08 pts/2 0:00 grep restore
root 28868 27578 0 14:37:38 pts/4 0:11 /opt/CSCOpx/objects/perl/bin/perl /opt/CSCOpx/bin/restorebackup.pl -d /product/
-rw-r--r-- 1 root other 2693888 May 8 14:37 restorebackup.log.old
-rw-rw---- 1 root casusers 68234 May 8 15:19 CCRImport.log
-rw-r----- 1 casuser casusers 36059 May 8 15:23 ess.log
-rw-r----- 1 casuser casusers 981749 May 8 15:23 SyslogCollector.log
-rw-rw---- 1 casuser casusers 131257 May 8 15:25 cri.log
-rw-r----- 1 casuser casusers 4874 May 8 15:26 stpeng.log
-rw-rw---- 1 casuser casusers 1740435 May 8 15:26 EssentialsDM_Server.log
-rw-rw---- 1 casuser casusers 2580742 May 8 15:26 dcmaservice.log
-rw-rw---- 1 casuser casusers 15486 May 8 15:26 ca.log
-rw-r----- 1 casuser casusers 9572 May 8 15:27 CMFOGSServer.log
-rw-rw---- 1 casuser casusers 25736 May 8 15:27 AnalyzerDebug.log
-rw-r----- 1 casuser casusers 49719 May 8 15:27 lwms.log
-rw-rw---- 1 casuser casusers 5419488 May 8 15:29 NetShowServer.log
-rw-r--r-- 1 root other 13007803 May 8 15:31 migration.log
-rw-r--r-- 1 casuser casusers 1755759 May 8 15:31 restorebackup.log
-rw-r----- 1 casuser casusers 67085 May 8 15:36 CampusOGSServer.log
-rw-rw---- 1 casuser casusers 1730657 May 8 15:36 pidm.log
-rw-rw---- 1 casuser casusers 446547 May 8 15:36 license.log
-rw-r----- 1 casuser casusers 3002 May 8 16:02 RMEDbMonitor.log
-rw-r----- 1 casuser casusers 3002 May 8 16:02 CmfDbMonitor.log
-rw-rw---- 1 casuser casusers 35518 May 8 16:02 CTMJrmServer.log
-rw-rw---- 1 casuser casusers 404179 May 8 16:02 NCTemplateMgr.log
-rw-rw---- 1 root casusers 31270 May 8 16:02 rme_base.log
-rw-rw---- 1 root casusers 8124634 May 8 16:02 rme_ctm.log
-rw-rw---- 1 casuser casusers 5440687 May 8 16:02 IC_Server.log
-rw-rw---- 1 casuser casusers 1686819 May 8 16:03 dmgtd.log
-rw-rw---- 1 casuser casusers 75698 May 8 16:03 daemons.log
-rw-rw---- 1 casuser casusers 167922 May 8 16:03 RMECSTMServerlog
-rw-rw---- 1 casuser casusers 2321 May 8 18:41 diskWatcher.log
-rw-rw---- 1 casuser casusers 31417435 May 8 18:56 dmgtDbg.log
05-08-2007 04:43 PM
These patches should not be causing this problem. Assuming the migration.log looks the same as before, the migration is complete, so things should start happening in the restore log. In that case, please get the output of:
lsof -p 28868
And the /tmp/restore.truss output of the following command (let it run for a minute, then do control+C):
truss -a -f -vall -rall -wall -o /tmp/restore.truss -p 28868
05-09-2007 05:28 AM
Please find the outputs in the two new attachments to TAC case 605960929. Thanks.
lsof-p.txt
restore.truss
Discover and save your favorite ideas. Come back to expert answers, step-by-step guides, recent topics, and more.
New here? Get started with these tips. How to use Community New member guide