This blog relates to experiences in the Systems Capacity and Availability areas, focusing on statistical filtering, pattern recognition, and BI analysis and reporting techniques (SPC, APC, MASF, 6-SIGMA, SEDS/SETDS and others).
I have just received some internal feedback that impressed me, so I decided to share
it here as a good indication of the difference made by the Capacity Management service I provide to my customers:
..."Igor has helped us on a
couple of instances, prevent potential disasters within
our platform by the historical trends he’s tracking. We know
this since there was one occasion where our server crashed when we didn't react fast enough to Igor’s email warnings..."
Conference rooms and office stand-ups are all buzzing. Artificial intelligence (AI) is emerging to represent one of the largest technological shifts across countless industries in recent history.
From driving manufacturing to supporting marketing to improving customer retention to improving ITOps, AI has become the most in-demand technology among IT execs and is quickly rising to the top of the list for IT investment in 2019.
According to recent reports, the machine learning market alone is anticipated to grow from $1.4B in 2017 to $8.8B by 2022.
CMG wants to help you navigate this technological shift and keep you the smartest team member at the table. On November 27th, join CMG and its partners for AIXCHANGE. The latest installment of CMG’s virtual conference program will feature live presentations from companies and individuals leading in the AI space.
10:00 AM - The History and Future of AI with Bryan Krouse
11:00 AM - The Machines are Talking - The Future of AI and Chat for the Enterprise with Stephen Mallik
12:00 PM - A Practical Guide for Information Discovery Using Machine Learning and Visualization
1:00 PM - To Be Announced!
2:00 PM - Soon AI will Test Everything with Jason Arbon
"...Control charts are used during the Control phase of DMAIC
methodology. Control charts, also known as Shewhart charts or process-behavior
charts, are a statistical process control tool used to determine if a
manufacturing or business process is in a state of control. If analysis of the
control chart indicates that the process is currently under control, then no
corrections or changes to process control parameters are needed. Moreover, data
from the method can be used to predict the future performance of the
process. If the control chart indicates that the process is not in
control, analysis of the chart can help determine the sources of variation, as
this will result in degradation of process performance..."
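To make the quoted definition a bit more tangible, here is a minimal sketch of an individuals (XmR) Shewhart chart in Python. It is only an illustration under my own assumptions: the sample numbers are made up, the function names `xmr_limits` and `out_of_control` are hypothetical, and sigma is estimated from the average moving range divided by the standard d2 constant 1.128.

```python
from statistics import mean

def xmr_limits(baseline):
    """Return (lcl, center, ucl) for an individuals (XmR) control chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline points")
    center = mean(baseline)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    # 1.128 is the d2 constant for subgroups of size 2.
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(values, lcl, ucl):
    """Return the indexes of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Made-up CPU utilization samples (%) taken while the process was stable.
baseline = [50, 52, 49, 51, 50, 53, 48, 50, 51, 49]
lcl, center, ucl = xmr_limits(baseline)

# New measurements checked against the limits learned from the baseline.
new_points = [51, 47, 64, 50]
flagged = out_of_control(new_points, lcl, ucl)
print(f"limits: {lcl:.2f} .. {ucl:.2f}, flagged indexes: {flagged}")
```

Points inside the limits need no reaction; a flagged point is the signal to go and look for the source of variation.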
When I was developing SEDS (Performance Anomaly Detection System) years
ago, I looked at that package (and referenced the link in my early
CMG papers, BTW), even though I did not know how to write R programs then (now I can!)... They
might have improved it since, but at that time I did not find a way to do MASF-type
control charts with it. I have even dreamed of building a SETDS charts (IT-Control
Charts) package as open source. So my approach is the same but
different; please read the details in my paper: https://www.researchgate.net/publication/259486289_IT-Control_Chart
IMPACT 2019 will be an action-packed, 3-day conference filled with information and collaboration. Register today for #IMPACT2019 and take advantage of $100 off your conference pass with code FOB2019: https://cmgimpact.com/ #cmgnews
Act today and take $100 off your registration fee with code “FOB2019”. To find out more about content, sessions, and activities and what makes CMG’s IMPACT the best #technology conference on the planet, click here: https://cmgimpact.com/ #cmgnews #IMPACT2019
Join hundreds of #industryleaders for CMG's 44th International Conference! #IMPACT2019 promises to be an exciting conference with great learning and networking opportunities. Discount code (ASK ME!) available to save $100 off a conference pass. Act today and save! https://cmgimpact.com/ #cmgnews
IMPACT 2019 #Conference sessions will educate and enlighten, enabling attendees to take a #leadership role in their own companies’ #digitaltransformations. Register today for #IMPACT2019 and take advantage of $100 off your conference pass with “FOB2019”: https://cmgimpact.com/ #cmgnews
Computer Measurement Group (www.CMG.org), one of the world’s most influential organizations of IT professionals committed to digital transformation initiatives and best practices, is delighted to announce the launch of its new brand and communications platform, a platform built to showcase its measurement and management of computer systems and networks from a performance and capacity ...
"I have been meaning to reach out to you to let you know about what we had done. Technically the product is still in Beta, but it seems the marketing team have pushed forward with making it generally available.
We found your work of great value as we looked through various methodologies for trending. In the end we implemented something based on SEDS with a few changes/additions; to be honest, I would have to review the code to see what the differences are.
At the moment the product opTrend is working well enough, but we need to make some refinements and enhancements before it will be the first release.
Your research and publications are of great value and highly appreciated."
"...These solutions were then complemented by the addition of opTrend, which expands on Opmantek’s already expansive thresholding and alerting system by implementing a highly flexible Statistical Exception Detection System (SEDS) that learns what’s normal behavior on the client’s network and adjusts thresholding dynamically based on historical usage for every hour of each day of the week..." The description is limited, but apparently this is my SEDS method (MASF-based anomaly detection), published in several white papers and blog posts. I am happy, except that there is no reference to my name, my papers, or at least this blog.
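For readers who have not seen MASF-style thresholds before, the idea of "normal for every hour of each day of the week" can be sketched as below. This is not Opmantek's code and not my production SEDS; it is a simplified Python illustration that assumes a plain mean plus or minus k standard deviations band per (weekday, hour) slot, with hypothetical function names and made-up numbers.

```python
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(history, k=3.0):
    """Build {(weekday, hour): (lower, upper)} from (weekday, hour, value) records."""
    buckets = defaultdict(list)
    for weekday, hour, value in history:
        buckets[(weekday, hour)].append(value)
    limits = {}
    for slot, values in buckets.items():
        if len(values) < 2:
            continue  # not enough history to estimate the spread for this slot
        m, s = mean(values), stdev(values)
        limits[slot] = (m - k * s, m + k * s)
    return limits

def is_exception(limits, weekday, hour, value):
    """True if the value falls outside the band learned for that weekday/hour slot."""
    band = limits.get((weekday, hour))
    if band is None:
        return False  # no baseline yet, so nothing to compare against
    lower, upper = band
    return value < lower or value > upper

# Made-up utilization history: Monday (0) at 09:00 is busy, Sunday (6) at 03:00 is quiet.
history = [(0, 9, v) for v in (60, 62, 58, 61, 59)]
history += [(6, 3, v) for v in (10, 12, 11, 9, 10)]
limits = build_baseline(history)

# The same reading is normal in one slot and exceptional in the other.
print(is_exception(limits, 0, 9, 60))
print(is_exception(limits, 6, 3, 60))
```

The point of the design is that a single static threshold cannot do this: 60% is business as usual on Monday morning and a clear exception on Sunday night.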
Part 1. The Neural Network (NN) is not a new machine learning method. About 12 years ago I was involved as a Capacity Planning resource in a project to build the infrastructure (servers) to run NNs for a fraud detection application. NNs now get much more attention and popularity as a part of AI, mostly because computing power has increased dramatically and, accordingly, more tasks can be done with NNs.
The goal of the presentation is to demystify the technique with simple terms and examples, showing what it actually is and how it could be used for Capacity and Demand management. This is done by developing R code that recognizes typical workload patterns, such as OLTP, in the daily profiles of time-series performance data.
Part 2. Detecting anomalies is a typical concern for short-lived objects or for objects with a very small number of measurements. Why? There could be thousands and thousands of such objects, so it is important to separate out the exceptional ones with anomalies for further investigation. These could be servers or customers that have just started being monitored, or public cloud objects (EC2s, ASGs) that usually have a very short lifespan. The suggested approach to detecting anomalous behavior in this type of object is to estimate the entropy of each object. If the entropy is low, everything is most likely in order. If not, there is possibly some disorder or mess, and someone needs to check what is going on with the object. The method is implemented in a cloud-based application written in R that scans all cloud Auto Scaling Groups (ASGs) every hour to detect those that are imbalanced in terms of the number of EC2 instances in the group. That makes it possible to separate a couple of hundred ASGs out of hundreds of thousands of them.
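A small Python sketch of the entropy idea follows (the real application is written in R and is not shown here). The details are my assumptions for illustration only: entropy is taken as the Shannon entropy of the hourly EC2 instance counts observed for each ASG, the ASG names and counts are made up, and the cut-off `ENTROPY_THRESHOLD` is a hypothetical value that would need tuning.

```python
from collections import Counter
from math import log2

def shannon_entropy(observations):
    """Shannon entropy (in bits) of the empirical distribution of the observations."""
    counts = Counter(observations)
    total = len(observations)
    # p * log2(1/p) summed over distinct values; 0 when every observation is the same.
    return sum((c / total) * log2(total / c) for c in counts.values())

# Made-up hourly instance counts for two short-lived Auto Scaling Groups.
asg_instance_counts = {
    "asg-stable": [4, 4, 4, 4, 4, 4],   # always the same size: ordered
    "asg-noisy": [2, 9, 4, 12, 3, 7],   # size jumps around: disordered
}

ENTROPY_THRESHOLD = 1.0  # hypothetical cut-off, in bits

flagged = [
    name
    for name, counts in asg_instance_counts.items()
    if shannon_entropy(counts) > ENTROPY_THRESHOLD
]
print(flagged)
```

Only a handful of observations per object is needed, which is why this kind of measure suits objects that do not live long enough to build a statistical baseline.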
1. "Preparing Today’s Infrastructure for Tomorrow’s
Workloads", Kofi Hayford, DataCore
2. "Flash, Flash and more Flash: What’s
in it for me?", John Baker from Data Kinetics
ABSTRACT: Flash. Is this the light on my
camera, a plugin for my browser, or a superhero that runs really fast? If
you’re in IT, flash likely refers to something else: flash memory of some sort.
But even then, the term is both ubiquitous and vague. The solid-state drive in
your laptop; the new memory in your z14; your new “all flash” storage array.
These all boast “flash” technology that promises to make everything faster. Flash technology is not new but reductions in
price and advancements in capacity are exploding adoption. There are many
compelling arguments for flash – but is it the solution to all storage
bottlenecks? Take a stroll with John as we slow down the hype to explore the
various implementations of flash technology and which, if any, can best speed
up your datacenter. And don’t worry; I won’t be wearing a red, spandex suit.
3. "Dave’s not here: R4HA, AWLC, SCRT and
other 4-letter software pricing words", John Baker from Data Kinetics
ABSTRACT: David Chase has left the building.
Which leaves many of us scratching our heads about the whirlwind of available
pricing options. IBM Z continues to improve efficiency and options like
Country-Wide Multiplex (CMP), Mobile and Container pricing, and soft capping
options, improve ROI but how does all this stuff work? The R4HA remains key but
even that requires context. It’s measured on each LPAR but often combined
depending on what, and where, various subsystems are running. Which value,
average, or total determines your software bill? And what if some LPARs are
capped? This session is for everyone. Performance, Capacity, or Finance. Costs
are driving more and more of our decisions. Let’s start by understanding how
these costs are determined.
PRESENTER: John Baker has over 25 years in the
IT industry as both a customer and consultant. For the last 20 years, John has
focused on mainframe performance and cost optimization. As a customer, John
designed, implemented and maintained many critical projects such as WLM Goal
Mode, GDPS/Data Mirroring, and merging datacenters.
Light LUNCH is provided by Compuware
4. Compuware presentations by Kelly Vogt:
- "The Practical and Wholesale
Application of Strobe to Reduce Cost, Improve Scalability and
ABSTRACT: Tuning is often a reactive activity.
A crisis presents and we turn out in force to stamp out the fire; then return
to the myriad other things we have to do… But what if we flip that around and
make it a proactive activity for fun and profit? See how a common sense
approach to proactive treasure hunting in your system can save large amounts of
money in your installation and improve service for your customers.
- "The Necessity of, and the Value
Proposition for, Total Batch Automation with ThruPut Manager"
ABSTRACT: What is the future of your batch
strategy? Is it nimble? Fluid? For all the automation brought to the mainframe,
why is it that batch is still manually operated? All that tracking jobs and
endless commands fiddling about with initiators and job classes. Learn why we
cannot continue on like we are… and what we can do about it.
PRESENTER: Kelly Vogt joined Compuware in
February as a Field Tech in support of ThruPut Manager. He has 38 years in the
mainframe arena, with 24 years in systems programming and performance
management. His last 14 years were spent in management leading Large Systems
Engineering for Humana. He has extensive experience buying and renewing IBM and
ISV hardware/software; knowledge of software license models and contracts;
performance and capacity planning experience; and the day in, day out of data
What is Capacity Management? [Webinar Recap]: Capacity management is the practice of making sure IT resources meet business demands today and down the road—without over-provisioning. But the role of capacity management has changed as IT environments have evolved.
"Machine Learning for Predictive Performance Monitoring",
which is available for CMG members
I enjoyed reading the paper; below is the abstract:
I especially like the following very true statement of his:
"...Machines don’t actually “learn” nor do statistical algorithms represent some mechanistic disembodied intelligence. However, human learning and intelligence is greatly assisted by statistical modeling in much the same way that optics technology assists vision..."
I appreciate that he referenced two of my CMG papers under his "Useful Related Materials" section:
Reading "Anomaly detection with Apache MXNet":
"An important distinction has to be made between anomaly detection and “novelty detection.” The latter turns up new, previously unobserved, events that still are acceptable and expected. For example, at some point in time, your credit card statements might start showing baby products, which you’ve never before purchased. Those are new observations not found in the training data, but given the normal changes in consumers’ lives, may be acceptable purchases that should not be marked as anomalies."
I figured out that my SETDS method has this novelty detection included, as my
EV-based trend detection method (e.g. implemented in R as "TrendieR") finds recent change points in the time-series data and then, by building a trend forecast, checks whether the change is permanent or not. If it is permanent, a possible "novelty" is detected.
So the 1st part of SETDS (e.g. implemented in R as "SonR") captures just anomalies and/or outliers; trend detection then separates out the cases that indicate a possible "novelty" (something changed, stays changed, and keeps growing). False positives could still be there, though...
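The distinction can be shown with a deliberately simplified Python sketch. It is a stand-in, not TrendieR or SonR: instead of EV-based change point detection and a trend forecast, it only assumes fixed baseline limits and treats a run of at least `min_run` consecutive exceptions at the end of the series as a change that "stays changed". The function name, the limits and the data are all hypothetical.

```python
def classify(values, lower, upper, min_run=5):
    """Label a series 'normal', 'anomaly' (isolated exceptions) or 'possible novelty'."""
    outside = [v < lower or v > upper for v in values]
    if not any(outside):
        return "normal"
    # Count how many of the most recent points are consecutive exceptions.
    run = 0
    for flag in reversed(outside):
        if not flag:
            break
        run += 1
    # A sustained run suggests the change is permanent rather than a one-off outlier.
    return "possible novelty" if run >= min_run else "anomaly"

# Made-up series checked against a made-up baseline band of 40..60.
steady = [50, 51, 49, 52, 50, 48, 51, 50]
spike = [50, 51, 80, 50, 49, 51, 50, 52]   # one outlier, then back to normal
shift = [50, 51, 49, 70, 72, 71, 73, 74]   # level changed and stayed changed

for name, series in (("steady", steady), ("spike", spike), ("shift", shift)):
    print(name, classify(series, 40, 60))
```

As with the real method, a long but temporary burst would still be labelled a possible novelty here, so false positives remain possible.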
BTW there is a 3rd level of SETDS, which is the way to correlate performance data with demand (driver) data to build meaningful forecasts (e.g. implemented as "Model Factory").