I'll be taking over administration of a rack of Solaris machines that haven't had an admin for the last 9 months. Prior to that they had limited maintenance. I understand there are a few tickets that will need to be addressed, but I won't have the details on them for a few days. Regardless, I'm trying to compile a to-do list. What would you add to this list?
1, Check for hardware failures (disks, fans, PSUs, etc.) and repair as needed
2, Ensure backups are being taken and are restorable
3, Check all installed packages for exploits, update as needed
4, Check who has permissions to access these servers, internally and externally. Verify they all should have access.
5, Set up monitoring that alerts me to issues immediately.
6, Acquire Oracle Support agreement details so if/when I need them, I have ready access.
What would you add/change/remove on this list? Thanks in advance for your help.
Edited, Thanks arbrante
Hmm, check the hardware so that you don't have servers running with broken disks or other broken components. It's fairly common for unmaintained systems to have a broken disk, which isn't noticeable as long as the remaining disk/part doesn't fail.
Consider grabbing a snapshot of mounted filesystems, running services, and running processes.
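As a rough sketch of the disk part of that check: on Solaris, `iostat -En` normally prints a per-device summary line with soft, hard, and transport error counts. The Python below parses saved output of that command and lists devices with any non-zero counter. The function name and the sample text are made up for illustration, and it assumes the usual "Soft Errors: N Hard Errors: N Transport Errors: N" layout; it says nothing about fans or PSUs, which need a separate check.

```python
import re

# Assumed layout of the per-device summary line printed by `iostat -En`:
#   c0t1d0   Soft Errors: 2 Hard Errors: 5 Transport Errors: 0
ERROR_LINE = re.compile(
    r"^(\S+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)\s+Transport Errors:\s*(\d+)",
    re.MULTILINE,
)


def devices_with_errors(iostat_text):
    """Return {device: error counts} for devices with any non-zero counter."""
    flagged = {}
    for dev, soft, hard, transport in ERROR_LINE.findall(iostat_text):
        counts = {"soft": int(soft), "hard": int(hard), "transport": int(transport)}
        if any(counts.values()):
            flagged[dev] = counts
    return flagged


if __name__ == "__main__":
    # Hypothetical sample; in practice, feed in the saved output of `iostat -En`.
    sample = (
        "c0t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0\n"
        "c0t1d0 Soft Errors: 2 Hard Errors: 5 Transport Errors: 0\n"
    )
    for dev, counts in devices_with_errors(sample).items():
        print(dev, counts)
```

Soft errors alone are often harmless, so you may prefer to flag only hard and transport errors.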
... If the machine gets rebooted (either planned or unplanned) you may have an indicator it has not reached the expected state.
... Try to identify anything that has to be manually restarted on reboot.
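One possible way to do the snapshot and comparison, as a minimal Python sketch. The command list (`mount -p`, `svcs -H -o state,fmri`, `ps -e -o args`) is an assumption about what is available on these hosts, and the function names are hypothetical; swap in whatever commands suit your Solaris release. Capture once before a reboot, once after, and compare.

```python
import subprocess

# Assumed commands; adjust for your Solaris release and privileges.
SNAPSHOT_COMMANDS = {
    "mounts": ["mount", "-p"],
    "services": ["svcs", "-H", "-o", "state,fmri"],
    "processes": ["ps", "-e", "-o", "args"],
}


def parse_lines(text):
    """Turn command output into a set of non-empty, stripped lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def capture(commands=SNAPSHOT_COMMANDS):
    """Run each command and return {category: set of output lines}."""
    snapshot = {}
    for name, argv in commands.items():
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        snapshot[name] = parse_lines(result.stdout)
    return snapshot


def compare(before, after):
    """Return {category: (missing, extra)} for categories that differ."""
    report = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, set())
        new = after.get(name, set())
        missing = sorted(old - new)  # present before, gone now
        extra = sorted(new - old)    # new since the first snapshot
        if missing or extra:
            report[name] = (missing, extra)
    return report
```

Anything that shows up as "missing" after a reboot is a candidate for the list of things that need a manual restart. Processes that legitimately come and go will add some noise, so treat the process diff as a hint rather than a verdict.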
You cross-posted this to other forum web sites
... and didn't have the manners to mention that fact so that people wouldn't have to spend time formulating a response that duplicates what you've already been told in those other forums.
This thread is locked.