on 11-25-2024 07:47 AM - edited on 12-05-2024 03:55 PM by Tyler Langston
ThousandEyes is a powerful SaaS platform that provides a digital picture of enterprise infrastructure, formed by test views, alerts, dashboards, and other components. In this article, we will discuss the lifecycle of ThousandEyes Alerts, including the processes of creating, configuring, triggering, and clearing alerts.
ThousandEyes Alerts are a critical component of the ThousandEyes platform. These messages notify users of performance deviations or problems in the networks and applications they monitor. Strictly speaking, a ThousandEyes Alert indicates that one or more tests have exceeded a defined error threshold. Alerts are configured using alert rules, which specify the conditions under which an alert will be triggered. More information about ThousandEyes Alerts can be found in our official ThousandEyes documentation.
The steps we’ll review in this article are listed below. Feel free to skip to the one most relevant to your situation, or follow along from Step 1 through Step 8 to become an expert in all things Alerts!
1. Create the Alert Rule. In this step we’ll create and define a new alert rule named "Community Nov 2024 Alert" focused on packet loss. This step includes setting the conditions that will trigger the alert, such as a specific percentage of packet loss over a defined duration.
2. Create the Test. Learn how to set up a new test called "Community Nov 2024 Test". This test should target the desired destination and be configured to monitor metrics relevant to your alert rule, such as packet loss.
3. Verify Test Execution. Before we break anything, we’ll open the test view and confirm that the test is running and that the target is responding to ICMP requests.
4. Simulate Packet Loss. Deny ICMP traffic to the destination by configuring firewall rules or network settings. Verify that ping requests to the host are failing, simulating packet loss.
5. Recheck Test Results. With our test-traffic (or lack thereof) ready to go, next we’ll confirm that the test is still running and accurately reflecting the simulated packet loss in its results.
6. Verify Alert Triggering. With the test working properly, it’s time to check that the alert is triggered due to the packet loss condition. We’ll go through the process of checking ThousandEyes for active alerts and confirm receipt of an email notification indicating the alert has been triggered.
7. Restore ICMP Traffic. Now that it’s been successfully triggered, our last few steps will review allowing ICMP traffic again by reverting changes made in step 4. This should allow you to verify that ping requests are successful and that the test results indicate normal operation (green status).
8. Confirm Alert Clearance. With service now ‘restored’ our alert should automatically clear. We’ll go through that process together, making sure that the alert history in ThousandEyes has been appropriately archived and an email notification confirming the alert’s resolution is sent.
Let's get started!
To create a new alert rule, navigate to Alerts > Alert Rules and click on "Add New Alert Rule":
From the ‘Add New Alert Rule’ screen, fill in the appropriate information. For the purpose of our article we will fill out the information as follows:
[1] Alert type: Network – Agent to Server
[2] Rule name: name your rule. In our example, this is "Community Nov 2024 Alert".
[3] Test: leave it unchecked for now; we will link the Alert to the test in Step 2.
[4] Agents: you can leave the default value here.
[5] Severity: you can leave the default (info) value here.
[6] Alert detection: manual. With that option, we can explicitly define the triggering conditions.
[7,8] Alert conditions. In our example, the conditions read: trigger the Alert if at least 1 agent, at least 1 time, sees packet loss greater than 10%.
[9] Create New Alert Rule. Finalize the new Alert creation by clicking this button.
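To make the alert condition concrete, here is a minimal sketch of the logic the rule expresses for a single test round: count the agents whose packet loss exceeds the threshold and compare that count with the required number of agents. This is purely illustrative and is not how ThousandEyes implements alerting; the function name `should_trigger` and its arguments are hypothetical.

```shell
# Illustrative only: mimics "at least MIN_AGENTS agent(s) see packet loss
# greater than THRESHOLD percent" for one test round.
# Usage: should_trigger THRESHOLD MIN_AGENTS LOSS_PERCENT...
# Loss values are whole-number percentages, one per agent.
should_trigger() {
  threshold=$1
  min_agents=$2
  shift 2
  over=0
  for loss in "$@"; do
    # Strictly greater than, matching "greater than 10%" in the rule.
    if [ "$loss" -gt "$threshold" ]; then
      over=$((over + 1))
    fi
  done
  if [ "$over" -ge "$min_agents" ]; then
    echo "trigger"
  else
    echo "ok"
  fi
}
```

With the values from our example, `should_trigger 10 1 100` prints `trigger` (one agent at 100% loss), while `should_trigger 10 1 0` prints `ok`.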
In this article, we will be using ThousandEyes Network Agent-to-Server tests to monitor a target using ICMP probes at 1-minute intervals.
To create this test, navigate to Cloud & Enterprise Agents > Test Settings > Create a single test:
Within the ‘Start with a single test’ screen, fill in the appropriate information. For the purpose of our article we filled out the information as follows:
[1] Layer: Network.
[2] Test type: Agent to Server.
[3] Test name: Community Nov 2024 Test.
[4] Test Description: Optional field; enter a description of the test here.
[5] Target: Enter the FQDN or IP address of the Linux machine created specifically for this test.
Note: Full control of this machine will allow us to configure local iptables to drop incoming ICMP requests later.
[6] Protocol: ICMP.
[7] Interval: 1 minute.
[8] Agents: Select the agent(s) the test should run on. For this article, we're choosing just one agent.
[9] Alerts: Enable the checkbox and select the Alert rule we created in Step 1 (Community Nov 2024 Alert).
[10] Create New Test. Complete the test creation by clicking this button.
Now that we’ve created the test and the alert, it’s time to make sure it’s working!
To get started, open the test view by clicking the icon to the right of the Test name:
Once clicked, we can observe the test running and that the target is responding to ICMP requests:
With our test configured and verified working, it’s time to break things! As we have full control over the target server, we can easily manipulate traffic rules to ‘fake’ some packet loss for our test to uncover and generate an alert. The target is a Linux machine, so we simply need to configure the local firewall to drop ICMP traffic with the following snippet:
iptables -I INPUT 1 -p icmp --icmp-type echo-request -j DROP
This rule inserts an entry at position 1 of the INPUT chain that drops incoming ICMP echo requests (pings).
Before applying the rule, verify the target is responding to ICMP requests. You can do this by pinging the target from a terminal window:
Next, apply the iptables rule from above:
Once the rule is in place, the pings should time out:
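If you prefer a number over eyeballing the terminal, the loss percentage can be pulled out of the ping summary line. The following is a small sketch that assumes the usual `N% packet loss` wording printed by ping on Linux; the helper name `packet_loss_percent` and the `TARGET` variable are hypothetical.

```shell
# packet_loss_percent: reads ping output on stdin and prints the packet
# loss percentage from the summary line, without the % sign.
# Assumes the summary contains text such as "0% packet loss".
packet_loss_percent() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example usage (TARGET is a placeholder for your test target):
#   ping -c 5 "$TARGET" | packet_loss_percent
```

Run it once before applying the rule and once after to compare the two values. Note that ping itself exits with a non-zero status when no replies arrive, so check the printed value rather than the exit code.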
Now that we know it’s ‘broken’, we should expect to see changes in the test view. Go back to ThousandEyes and rerun the test we set up in Step 2.
Using our example we observed 100% packet loss, indicating that the target is no longer responding to ICMP requests due to the applied iptables rule:
With our test confirming that something is ‘broken’, it’s time to check to see that we receive the alert, since the conditions we stipulated when we created the alert in Step 1 have been met.
To do this, navigate back to the Active Alerts tab.
Here we can see our alert rule triggered as anticipated:
Additionally, if a notification email has been specified as part of its configuration, the designated recipient should receive an email notification regarding the triggered Alert:
Here is what the email notification looks like when triggered and sent:
Now that we’ve tested the Alert to ensure that it works as we expect it to, it’s time to ‘unbreak’ our traffic to allow pings to reach the destination again. This is relatively simple: we just need to remove the firewall rule that is blocking ICMP traffic. This can be done using the following command:
iptables -D INPUT 1
The command deletes the first rule in the INPUT chain and should immediately restore communication that was previously blocked:
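Deleting by position assumes the ICMP rule is still first in the INPUT chain. If other rules may have been inserted in the meantime, a slightly more defensive option is to delete by rule specification instead. The sketch below uses the standard `iptables -C` (check) and `-D` (delete) options; the function name `unblock_icmp` and the `DRY_RUN` switch are hypothetical conveniences for this example, and running it for real requires root.

```shell
# Rule specification copied from the command used earlier to block ICMP.
ICMP_DROP_SPEC="-p icmp --icmp-type echo-request -j DROP"

# unblock_icmp: removes the DROP rule by specification rather than position.
# With DRY_RUN=1 it only prints the delete command (hypothetical switch).
unblock_icmp() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "iptables -D INPUT $ICMP_DROP_SPEC"
    return 0
  fi
  # Word splitting of $ICMP_DROP_SPEC is intentional here.
  # -C succeeds only if a matching rule exists, so we delete only then.
  if iptables -C INPUT $ICMP_DROP_SPEC 2>/dev/null; then
    iptables -D INPUT $ICMP_DROP_SPEC
  else
    echo "no matching ICMP DROP rule found in INPUT" >&2
  fi
}
```

Calling `unblock_icmp` with `DRY_RUN=1` set lets you preview the command before touching the firewall.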
Let's verify that the test has returned to a stable status:
With our ‘outage’ successfully resolved, let's verify the Alert gets the message and changes status as well. Once resolved, an Alert should move from Active Alerts to Alert History, indicating that the Alert has stopped triggering:
In the ‘Alert History’ section we can see the Alert Rule, along with the start time (Nov 25, 13:25:00 GMT +1), the Alert duration (29 minutes), and other information:
That is all! We have successfully navigated through the ThousandEyes Alert lifecycle.
If you are an existing customer, you can always contact our expert engineers and get almost instant support using ThousandEyes chat.