Detecting and Diagnosing Problems When z/OS Thinks It Is OK (includes PFA User Experience)
Project and Program: MVS, MVS Core Technologies
Tags: Proceedings, 2012, SHARE in Atlanta 2012
		
The presenter will discuss the multiple capabilities available on z/OS to detect and diagnose soft failures.
- Describe soft failure detection
  - Built into z/OS components, such as XCF stalled member detection
  - Provided by health checks
  - Provided by z/OS Predictive Failure Analysis (PFA)
  - Provided by other vendor products
- Highlight the kinds of problems each type of soft failure detection is good at and not good at
  - Machine time scale vs human time scale
  - Location in the stack
  - Detectable by performance metrics vs non-performance metrics
- Insight from building PFA to help reduce the impact of soft failures
  - Automation of alerts is key
  - z/OS can survive / recover from most soft failures
  - Most metrics are very time sensitive
 
Robert Abrams, IBM Corporation; Sam Knutson, GEICO