Red Hat Enterprise Linux Troubleshooting Guide : Identify, Capture and Resolve Common Issues Faced by Red Hat Enterprise Linux Administrators Using Best Practices and Advanced Troubleshooting Techniques.
Material type:
- text
- computer
- online resource
- ISBN: 9781785287879
- LC classification: QA76.76.O63 -- C36 2015eb
Cover -- Copyright -- Credits -- About the Author -- About the Reviewers -- www.PacktPub.com -- Table of Contents -- Preface -- Chapter 1: Troubleshooting Best Practices -- Styles of troubleshooting -- The Data Collector -- The Educated Guesser -- The Adaptor -- Choosing the appropriate style -- Troubleshooting steps -- Understanding the problem statement -- Asking questions -- Attempting to duplicate the issue -- Running investigatory commands -- Establishing a hypothesis -- Putting together patterns -- Is this something that I've encountered before? -- Trial and error -- Start by creating a backup -- Getting help -- Books -- Team Wikis or Runbooks -- Google -- Man pages -- Red Hat kernel docs -- People -- Documentation -- Root cause analysis -- The anatomy of a good RCA -- The problem as it was reported -- The actual root cause of the problem -- A timeline of events and actions taken -- Any key data points to validate the root cause -- A plan of action to prevent the incident from reoccurring -- Establishing a root cause -- Sometimes you must sacrifice a root cause analysis -- Understanding your environment -- Summary -- Chapter 2: Troubleshooting Commands and Sources of Useful Information -- Finding useful information -- Log files -- The default location -- Common log files -- Finding logs that are not in the default location -- Configuration files -- Default system configuration directory -- Finding configuration files -- The proc filesystem -- Troubleshooting commands -- Command-line basics -- Command flags -- The piping command output -- Gathering general information -- w - show who is logged on and what they are doing -- rpm - RPM package manager -- df - report file system space usage -- free - display memory utilization -- ps - report a snapshot of current running processes -- Networking -- ip - show and manipulate network settings.
netstat - network statistics -- Performance -- iotop - a simple top-like I/O monitor -- iostat - report I/O and CPU statistics -- vmstat - report virtual memory statistics -- sar - collect, report, or save system activity information -- Summary -- Chapter 3: Troubleshooting a Web Application -- A small back story -- The reported issue -- Data gathering -- Asking questions -- Duplicating the issue -- Understanding the environment -- Where is this blog hosted? -- Ok, it's within our environment -- now what? -- What services are installed and running? -- Looking for error messages -- Apache logs -- Verifying the database -- Verifying the WordPress database -- Establishing a hypothesis -- Resolving the issue -- Understanding database data files -- Finding the MariaDB data folder -- Resolving data file issues -- Validating -- Final validation -- Summary -- Chapter 4: Troubleshooting Performance Issues -- Performance issues -- It's slow -- Performance -- Application -- CPU -- Top - a single command to look at everything -- Determining the number of CPUs available -- ps - Drill down deeper on individual processes with ps -- Putting it all together -- A quick look with top -- Memory -- free - Looking at free and used memory -- Checking for oomkill -- ps - Checking individual processes memory utilization -- vmstat - Monitoring memory allocation and swapping -- Putting it all together -- Disk -- iostat - CPU and device input/output statistics -- Who is writing to these devices? -- iotop - A top top-like command for disk i/o -- Putting it all together -- Network -- ifstat - Review interface statistics -- Quick review of what we have identified -- Comparing historical metrics -- sar - System activity report -- CPU -- Memory -- Disk -- Network -- Review what we learned by comparing historical statistics -- Summary -- Chapter 5: Network Troubleshooting.
Database connectivity issues -- Data collection -- Duplicating the issue -- Finding the database server -- Testing connectivity -- Telnet from blog.example.com -- Telnet from our laptop -- Ping -- Troubleshooting DNS -- Checking DNS with dig -- Looking up DNS with nslookup -- What did dig and nslookup tell us? -- DNS summary -- Pinging from another location -- Testing port connectivity with cURL -- Showing current network connections with netstat -- Using netstat to watch for new connections -- Breakdown of netstat states -- Capturing network traffic with tcpdump -- Taking a look at the server's network interfaces -- Specifying the interface with tcpdump -- Reading the captured data -- A quick primer on TCP -- Reviewing collected data -- Taking a look on the other side -- Identifying the network configuration -- Testing connectivity from db.example.com -- Looking for connections with netstat -- Tracing network connections with tcpdump -- Routing -- Viewing the routing table -- Utilizing IP to show the routing table -- Looking for routing misconfigurations -- Hypothesis -- Trial and error -- Removing the invalid route -- Configuration files -- Summary -- Chapter 6: Diagnosing and Correcting Firewall Issues -- Diagnosing firewalls -- Déjà vu -- Troubleshooting from historic issues -- Basic troubleshooting -- Validating the MariaDB service -- Troubleshooting with tcpdump -- Understanding ICMP -- Understanding connection rejections -- A quick summary of what you have learned so far -- Managing the Linux firewall with iptables -- Verify that iptables is running -- Show iptables rules being enforced -- Understanding iptables rules -- Ordering matters -- Default policies -- Breaking down the iptables rules -- Putting the rules together -- Viewing iptables counters -- Correcting the iptables rule ordering -- Summary.
Chapter 7: Filesystem Errors and Recovery -- Diagnosing filesystem errors -- Read-only filesystems -- Using the mount command to list mounted filesystems -- A mounted filesystem -- Using fdisk to list available partitions -- Back to troubleshooting -- NFS - Network Filesystem -- NFS and network connectivity -- Using the showmount command -- NFS server configuration -- Exploring /etc/exports -- Identifying the current exports -- Testing NFS from another client -- Making mounts permanent -- Unmounting the /mnt filesystem -- Troubleshooting the NFS server, again -- Finding the NFS log messages -- Reading /var/log/messages -- Read-only filesystems -- Identifying disk issues -- Recovering the filesystem -- Unmounting the filesystem -- Filesystem checks with fsck -- The fsck and xfs filesystems -- How do these tools repair a filesystem? -- Mounting the filesystem -- Repairing the other filesystems -- Recovering the / (root) filesystem -- Validation -- Summary -- Chapter 8: Hardware Troubleshooting -- Starting with a log entry -- What is a RAID? -- RAID 0 - striping -- RAID 1 - mirroring -- RAID 5 - striping with distributed parity -- RAID 6 - striping with double distributed parity -- RAID 10 - mirrored and striped -- Back to troubleshooting our RAID -- How RAID recovery works -- Checking the current RAID status -- Summarizing the key information -- Looking at md status with /proc/mdstat -- Using both /proc/mdstat and mdadm -- Identifying a bigger issue -- Understanding /dev -- More than just disk drives -- Device messages with dmesg -- Summarizing what dmesg has provided -- Using mdadm to examine the superblock -- Checking /dev/sdb2 -- What we have learned so far -- Re-adding the drives to the arrays -- Adding a new disk device -- When disks are not added cleanly -- Another way to watch the rebuild status -- Summary.
Chapter 9: Using System Tools to Troubleshoot Applications -- Open source versus home-grown applications -- When the application won't start -- Exit codes -- Is the script failing, or the application? -- A wealth of information in the configuration file -- Watching log files during startup -- Checking whether the application is already running -- Checking open files -- Understanding file descriptors -- Getting back to the lsof output -- Using lsof to check whether we have a previously running process -- Finding out more about the application -- Tracing an application with strace -- What is a system call? -- Using strace to identify why the application will not start -- Resolving the conflict -- Summary -- Chapter 10: Understanding Linux User and Kernel Limits -- A reported issue -- Why is the job failing? -- Background questions -- Is the cron job even running? -- User crontabs -- Understanding user limits -- The file size limit -- The max user processes limit -- The open files limit -- Changing user limits -- The limits.conf file -- Future proofing the scheduled job -- Running the job again -- Kernel tunables -- Finding the kernel parameter for open files -- Changing kernel tunables -- Permanently changing a tunable -- Temporarily changing a tunable -- Running the job one last time -- A look back -- Too many open files -- A bit of clean up -- Summary -- Chapter 11: Recovering from Common Failures -- The reported problem -- Is Apache really down? -- Why is it down? -- What else was happening at that time? -- Searching the messages log -- Breaking down this useful one-liner -- The uniq command -- Tying it all together -- What happens when a Linux system runs out of memory? -- Minimum free memory -- How oom-kill works -- Determining whether our process was killed by oom-kill -- Why did the system run out of memory? -- Resolving the issue in the long-term and short-term.
Description based on publisher supplied metadata and other sources.
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.