
Tuesday, September 21, 2021

Got my 1st #AWScertification

 View my verified achievement from Amazon Web Services (AWS).

AWS Certified Cloud Practitioner was issued by Amazon Web Services Training and Certification to Igor Trubin

Friday, July 30, 2021

"Performance #Anomaly and #ChangePointDetection For Large-Scale System Management" for WorldS4 2021 - my presentation slide deck is available on RG

I have successfully given my presentation at the WorldS4 conference. The presentation deck is available HERE













Friday, July 23, 2021

I'm excited to present my paper "Performance #Anomaly and Change Point Detection for Large-Scale System Management" at the 5th World Conference on Smart Trends in Systems, Security and Sustainability

See that in the agenda: https://sched.co/lEkJ

 


Tuesday, July 20, 2021

Presenting in London - "Performance Anomaly and Change Point Detection For Large-Scale System Management"

I will be presenting my paper at the WorldS4 conference in London:

"Performance Anomaly and Change Point Detection For Large-Scale System Management"

(https://www.researchgate.net/publication/340926055_Performance_Anomaly_and_Change_Point_Detection_For_Large-Scale_System_Management)

Time slot in London time: 04:30 - 06:00 on 29th July 2021



Friday, June 11, 2021

Cloud Capacity Management Explained by CMG.org - #cmgnews

CMG publications about Cloud Capacity Management (some links accessible only for CMG members)

  1. Cloud Capacity Management (PDF doc from Metron-Athene)

  2. 8 Things You Need to Know About Capacity Planning for the Cloud (HelpSystems)

  3. How to Do Capacity Management in the Cloud (HelpSystems)

  4. (in UT) How to do Capacity Management in the Cloud Text is HERE (TeamQuest)  

  1. Building and Rebuilding a Data Center Every Day  (Netflix)

  1. Netflix Performance Tales in One Take

  2. Cloud Cost Optimization at Spotify

  1. Lifting the Cloud of Obscurity from your Cloud Deployment

  2. The new dimensions of cloud – IT infrastructure resource planning, optimization and cost BMC Software

  3. Redefining Enterprise Cloud Transformations: How Fidelity Investments is establishing a new foundation for Observability and Reliability (is coming)

  1. Cloud Capacity Management  by Kevin McLaughlin (Capital One)

  2. Under cloudy skies capacity planning in changing times – Brian Wong, Capital One

  3. Optimizing your Cloud – Igor Trubin, IT Manager, Capital One

Monday, June 7, 2021

How am I doing? LinkedIn recommendations (this year)

 




Thursday, March 25, 2021

SEDS based "CLOUD RESOURCES WORKLOAD PROFILING"

Based on the SEDS method, workload profiling of the main cloud objects (AWS EC2, RDS and EBS) has been implemented at my current work.

Next Tuesday 3/30 at 12:30 pm EST I will be sharing my experience of building and using this method at the Data Centers and Cloud Infrastructure virtual CMG.org conference. You are welcome to join! The topic of the presentation is "CLOUD RESOURCES WORKLOAD PROFILING"

ABSTRACT: How can you be sure that a cloud object’s (e.g., AWS EC2, RDS or EBS) workload fits its rightsized resources (compute, RAM, IO/s and network traffic)? That is very difficult to do using raw system performance data from monitoring tools. The best way is to use a weekly workload profile, which is a graphical visualization in the form of a MASF IT-Control chart. This chart shows the stability of the workload and reveals recent anomalies, such as runaway processes, memory leaks or, especially important for cloud objects, an unusual number of hours the object was down, all compared with the usual weekly pattern.

This presentation will describe how to build, read, and use workload profiles using real data examples and demonstrate how cloud capacity scaling can be verified.
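To make the idea of a weekly workload profile more concrete, here is a minimal Python sketch. It is not the SEDS implementation itself. It assumes hourly metric values (for example, CPU utilization) grouped by hour of the week (168 buckets), a baseline of several past weeks, and control limits at the mean plus or minus three standard deviations. The function names `weekly_profile` and `find_exceptions`, the parameter `k`, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

HOURS_PER_WEEK = 168  # 7 days x 24 hours


def weekly_profile(baseline_weeks, k=3.0):
    """Build a per-hour-of-week profile from baseline data.

    baseline_weeks: list of weeks, each a list of 168 hourly values.
    Returns a list of (mean, lower_limit, upper_limit), one per hour of week.
    The k-sigma limits are an assumption of this sketch.
    """
    profile = []
    for hour in range(HOURS_PER_WEEK):
        samples = [week[hour] for week in baseline_weeks]
        m = mean(samples)
        # stdev needs at least two samples; fall back to zero spread otherwise.
        s = stdev(samples) if len(samples) > 1 else 0.0
        profile.append((m, m - k * s, m + k * s))
    return profile


def find_exceptions(profile, latest_week):
    """Compare the most recent week with the profile.

    Returns (hour_of_week, value, direction) for every hour whose value
    falls outside the control limits; direction is "above" or "below".
    """
    exceptions = []
    for hour, (value, (_, lower, upper)) in enumerate(zip(latest_week, profile)):
        if value > upper:
            exceptions.append((hour, value, "above"))
        elif value < lower:
            exceptions.append((hour, value, "below"))
    return exceptions


if __name__ == "__main__":
    # Made-up baseline: four weeks of nearly flat utilization (50..53 %).
    baseline = [[50.0 + i] * HOURS_PER_WEEK for i in range(4)]
    latest = [52.0] * HOURS_PER_WEEK
    latest[10] = 95.0  # a spike, e.g. a runaway process
    latest[20] = 0.0   # the object was down for this hour
    for hour, value, direction in find_exceptions(weekly_profile(baseline), latest):
        print(f"hour {hour}: {value} is {direction} the usual weekly pattern")
```

Plotting the mean, the two limits and the latest week on one chart over the 168 hours would give a picture similar in spirit to the control chart described in the abstract. Hours at zero can be counted separately to track how long the object was down.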



Thursday, January 21, 2021

"Performance problem diagnosis in cloud infrastructures" (#CloudComputing #AnomalyDetection #ControlChart)

I found this interesting research (2016) by Olumuyiwa Ibidunmoye, which references my 2004 paper (Capturing Workload Pathology by Statistical Exception)

Abstract 

Cloud datacenters comprise hundreds or thousands of disparate application services, each having stringent performance and availability requirements, sharing a finite set of heterogeneous hardware and software resources. The implication of such complex environment is that the occurrence of performance problems, such as slow application response and unplanned downtimes, has become a norm rather than exception resulting in decreased revenue, damaged reputation, and huge human-effort in diagnosis. Though causes can be as varied as application issues (e.g. bugs), machine-level failures (e.g. faulty server), and operator errors (e.g. mis-configurations), recent studies have attributed capacity-related issues, such as resource shortage and contention, as the cause of most performance problems on the Internet today. As cloud datacenters become increasingly autonomous there is need for automated performance diagnosis systems that can adapt their operation to reflect the changing workload and topology in the infrastructure. In particular, such systems should be able to detect anomalous performance events, uncover manifestations of capacity bottlenecks, localize actual root-cause(s), and possibly suggest or actuate corrections. This thesis investigates approaches for diagnosing performance problems in cloud infrastructures. We present the outcome of an extensive survey of existing research contributions addressing performance diagnosis in diverse systems domains. We also present models and algorithms for detecting anomalies in real-time application performance and identification of anomalous datacenter resources based on operational metrics and spatial dependency across datacenter components. Empirical evaluations of our approaches shows how they can be used to improve end-user experience, service assurance and support root-cause analysis.

Control Charting Example from the paper



Anomalies Detection and Cloud Platform Selection During DevOps - #CMGimpact conference interesting session (#CMGnews #CloudComputing #AnomalyDetection)

The topic is interesting as it relates to two of my current interests: cloud computing and anomaly detection (AD). It is scheduled for this evening - https://cmgimpact.com/anomalies-detection-and-cloud-platform-selection-during-devops/

Here is the abstract:

"In this session, the presenters will review the challenges of anomaly detection during DevOps and discuss the methodology and use case of cloud platform selection for the application. There will be a focus on applying iterative modeling and gradient optimization to determine the minimum configuration and cost required to support new applications Service Level Goals in different clouds"






Wednesday, January 20, 2021

Enjoying Virtual CMG Impact conference. #CMGnews

 


This is a snapshot from last night's Q&A with Chris Molloy!
IMPACT 2021 is a great place for getting answers and networking with your peers! Come join us and take advantage of what IMPACT 2021 has to offer: https://hubs.li/H0F87V00