ITOM Unlocked >>> Coffee Chat with ITOM Product Team: How do you manage a major incident? | Sep 2, 8:30 a.m. PDT
Hello good samaritans of the Freshworks Community,
I’m guessing that since it’s mid week, you all must be inundated with meetings, driving ambitious projects, and tackling deadlines left and right. The goal is likely within sight. And you are hopeful of meeting your quarterly OKRs. But there’s a lingering unease at the back of your mind. An unpredictable spanner in the works that might upset your plans. And it happens to be spelt as o-u-t-a-g-e.
Am I right or am I right?
So, take a break, grab a cup of coffee, sit down with us, and share what a major incident means for your organization and how you respond to it.
Here are some questions to trigger your thoughts:
How do you identify a major incident?
How are end users informed?
How do your agents triage the incident and resolve it?
What part of the incident resolution process do you find the most painful?
If you are currently using Freshservice to manage major incidents, what did the process look like earlier? How has it changed after you started using Freshservice?
Share your experience and expertise with us and the Community. This will help us fine-tune the content for the webinar, ‘ITOM Unlocked: Master Major Incident’, scheduled on September 5, 2024.
Together, we’ll identify best practices and brainstorm how Freshservice could be used to make your business increasingly resilient.
A few points to keep in mind for this coffee chat:
Feeling inspired already? You don’t need to wait until Monday, September 2, ‘24 to share your inputs. Go ahead – we’re all ears!
If you cannot make it at 8:30 a.m. PDT on September 2, 2024, or if you arrive after the session has started, you can still post your answer to a particular question.
You can reply to each other’s threads/posts to keep the conversation going. The ITOM Product team and I will be engaging with you all.
Please block your calendars for 45 mins on Sep 2, at 8:30 a.m. PDT and join the live thread discussion here with us.
See you all soon!!!!
Cheers
Anusha
We previously had an MI process in Freshservice, largely revolving around an ‘MI’ tag that would get added to a particular incident and trigger workflows that way. However, we have now embedded the new MI process... and are yet to have a Major Incident to really test it. Both a good and a bad thing, I suppose!
I’m keen to hear what other tricks and tips people are using, but one we are in the process of setting up is the on-call calendar. We don’t make use of it for any other team in Freshservice, just due to the nature of our setup; however, the auto text and call features have allowed us to put our MI manager rota into Freshservice and build an alert and escalation policy. Not technically part of the OTB MI process, but a nice alignment of functionality.
Having recently been part of an MI and run through the PIR, I need to present the PIR to the service owners (Aftersales Director and Head of Service) and have a templated executive summary. The latest move to put the PIR into the reporting section is really useful!
BUT… when exporting the PIR, the formatting is not set to any sort of accepted document size (e.g. A4), so when merging the PIR PDF with the templated cover the size mismatch is jarring!
Not end-of-the-world stuff, but it just looks sloppy.
It would be good to be able to use a templated cover in line with our company document standard.
Hello everyone!
Welcome to the very first Coffee Chat with the ITOM Product team! @adityalamghare, @vidhisharma, @a5huto5h and several others are looking forward to hearing from you.
Let’s get started!
Cheers :)
Anusha
Hello, I understand the problem originates because our editor, and thereby our post-incident reports, do not have a width that corresponds to A4 size. We plan to update that and bring the width to A4 size, as well as add indicative lines on the continuous document that will visually help you understand where page breaks will be. We are targeting this for December. I am interested in knowing more about the templated cover. Is this some text that you paste at the top of the report, or something else?
Before putting your rotation roster in the on-call calendar, how was paging your teammates done in case of an MI? Was it done manually, or did you use some other tool?
It was done manually. Every Monday, that week’s Incident Manager would put a note in Teams saying they were on rota, giving their contact number, and noting when someone was covering for them. Now it is all handled on the on-call schedule, and if they are going to be unavailable (for example, if they are on an interview board or in a VIP meeting), they can arrange coverage directly from the schedule.
Hey @kenneth.anderson ,
Yes, the fewer the MIs, the better it is for the business :) And since you’ve had so few MIs, I’m assuming that your incident management process is more proactive than reactive – which is great!
Could you share with us how you have implemented this proactiveness, i.e. the ability to identify emerging incidents and act upon them before they become major incidents?
The manual process you had before sounds quite involved, but it's great to hear that the new setup is streamlining things for your team. Just curious: did you face any friction moving your team from the old setup to the new one? Do you have any tips for the transition?
Template would include:
Page 1: Company branded header page
Page 2: Executive summary
Page 3 onwards: Freshservice PIR document
Does the branded header page include logos? I believe they must also be getting squeezed and stretched currently.
I’ve just used 2 PDFs, exported and combined them. Easier than trying to format anything in Freshservice.
Thanks to everyone who joined our Coffee Chat
If you missed joining it live, no worries! Please leave your message in this thread and the ITOM Product Team will respond whenever they are next online.
Thanks @Anusha for bringing back the Coffee Chats on our Community! I’m tagging some of our other community members so that they can participate and share their thoughts too: @raymondcanilao, @eeha0120, @Daniel Söderlund, @DanielRuff, @zachary.king, @RonAnderson, @mbutler, @pOttenbacher, @JustinL, @HeatherM14!
@alyssia.correa Sorry for the inconvenience. It’s always challenging to join across different time zones. I have already addressed a few issues directly with the product team.
Many of my concerns relate to major incidents, change management, and Statuspage, as we handle these as one “big topic.” However, since this coffee talk is focused solely on major incidents, I will limit my discussion to this topic.
We had an extensive discussion with Ashutosh from your Statuspage Product Team.
Business Rules:
There is currently no support for business rules when a major incident is created. The only option available is “incident is created,” but there is a significant need to change fields and visibilities for a major incident as well. This feature is on the roadmap for approximately Q1/2025.
We do not want agents to be able to promote an incident to a major incident, because the mandatory fields are only enforced afterward, when the major incident is updated. Consequently, an agent can convert the ticket to a major incident and “ignore” the mandatory fields until they next update it. Therefore, we want to remove the “promote to major incident” button, similar to how the “close” button was removed (using a business rule).
Revamp of Permissions:
There will be a significant revamp of permissions in Q1/2025.
Current Issues:
It should not be possible for agents to delete an entry on the Statuspage within major incident management. For all other topics, there are specific rights for creating, editing, and deleting that can be granted as single permissions. For major incidents, it is called “manage” and includes “delete” as well.
Everyone can assign a service to a major incident (MI) that doesn’t belong to their group, which is acceptable. However, why is there a limitation on assigning child tickets to a major incident? We have 160 agent groups, and in the event of a major incident, many tickets cannot be assigned as child tickets/incidents to the major incident because agents lack permission for other groups. This association of child tickets was possible more than six months ago.
A service or an asset should be mandatory for a major incident, according to business rules. Additionally, forcing an emergency change should be considered (not sure if this feature is supported).
Automation - Trigger for Parent Ticket:
There is a trigger for “child ticket got added”, but there is no trigger for “parent ticket got added.” Use case: How do you automate an action towards a single child ticket? For example, you may want to automate a communication towards the newest child ticket, saying: “Hey, thanks for contacting us. Your ticket is part of a major incident in our organization. Please check the Statuspage for more details. We will send you an update once the incident is resolved. In case of any additional questions, feel free to ask.”
Currently, there is no way to automate this for a single child ticket, as automation can only be applied to all child tickets. A workaround could involve fetching the newest child ticket via API and automating actions towards that single ticket. We could implement this as a solution.
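A minimal sketch of that workaround, assuming the child tickets have already been fetched through the API as a list of dictionaries that each carry an `id` and an ISO 8601 `created_at` timestamp. The function names and the Statuspage URL are hypothetical, and both the fetch and the call that posts the reply are deliberately left out:

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def newest_child(child_tickets):
    """Return the most recently created child ticket, or None if there are none."""
    if not child_tickets:
        return None
    return max(child_tickets, key=lambda t: parse_ts(t["created_at"]))


def build_reply(statuspage_url):
    """Compose the one-off message for the newest child ticket only."""
    return (
        "Hey, thanks for contacting us. Your ticket is part of a major incident "
        f"in our organization. Please check the Statuspage for more details: {statuspage_url}"
    )


# Hypothetical sample data standing in for an API response.
children = [
    {"id": 101, "created_at": "2024-09-02T08:30:00Z"},
    {"id": 102, "created_at": "2024-09-02T09:15:00Z"},
    {"id": 103, "created_at": "2024-09-02T08:45:00Z"},
]
target = newest_child(children)
reply = build_reply("https://status.example.com")
```

Selecting by creation time is only a stand-in for "the child that was just attached"; if a ticket is created early and linked late, the time it was linked would be the better key.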
Automation - Use Attributes from Parent Ticket:
At present, you can set attributes for all child tickets, but it is not possible to dynamically fetch attributes from the parent ticket and assign them to all child tickets. For example, setting the same category or solution as the parent ticket. Idea: Create an action node for child tickets, setting the same value as in the parent ticket. Additionally, the solution should be an action node, “Set solution as.”
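The idea of an "inherit from parent" action can be sketched as below. This is only an illustration of the requested behaviour, not an existing Freshservice feature: the field names `category` and `sub_category` and the dictionary shape of a ticket are assumptions.

```python
def inherit_from_parent(parent, children, fields=("category", "sub_category")):
    """Build one update payload per child, copying the chosen fields from the parent.

    Children that already match the parent are skipped so no empty update is sent.
    """
    updates = {}
    for child in children:
        payload = {
            field: parent[field]
            for field in fields
            if parent.get(field) is not None and child.get(field) != parent[field]
        }
        if payload:
            updates[child["id"]] = payload
    return updates


# Hypothetical sample data.
parent = {"id": 1, "category": "Network", "sub_category": "VPN"}
children = [
    {"id": 11, "category": "Hardware", "sub_category": None},
    {"id": 12, "category": "Network", "sub_category": "VPN"},
]
updates = inherit_from_parent(parent, children)
```

Each entry in `updates` could then be sent as a separate ticket update, which is the step an action node would take care of.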
Issue with Statuspage within Major Incident Management:
I apologize, but I cannot omit this as Statuspage will be a major part of our major incident management. I will keep it brief as we have already forwarded our feedback to Ashutosh.
When there is an ongoing entry on the Statuspage, there is no indicator in the UI of the major incident. The UI always opens with the “e-mail thread” tab, and the active Statuspage entry is hidden.
Canned responses for Statuspage do not support subject lines. In the old product FreshStatus, you offered “templates” instead of “canned responses,” which had this functionality.
The date field should be automatically populated (but not disabled, as we might need to change it) from the “incident start time” field.
Sending an update to an entry on the Statuspage that sets every status to operational does not resolve the entry on the Statuspage. You have to click on “resolve” afterward. At the very least, a hint should be shown indicating that the agent needs to resolve the entry, or better yet, a prompt asking whether to resolve the entry on the Statuspage additionally.
Hi @DanielRuff - please no apologies needed! I understand the challenge with the multiple time-zones! Our team was live at that time but since this is an open virtual thread, they can address it at any point!
Appreciate you for the detailed feedback and I’ll let @Anusha and team get back to your feedback! Thanks again Daniel :)
The built in PIR and emails (including draft previews) were enough to sell it. That and ‘I’ve done the hard work of setting it up on the Sandbox, look how easy it now is’
Sorry I missed this - it was a holiday yesterday for my company, so I wasn’t tracking on it.
We use the Major Incident when our production system is down. Our industry is a bit different; we don’t track major incidents in terms of money like most companies would - instead, we look at Officer Safety. Running the National Law Enforcement network, when our message switch is down, lives are on the line. We can no longer get officers the information they need to determine if the person they are pulling over should be considered high risk. The officer will not know who they are dealing with until that first interaction occurs.
We run a 99.937% uptime on our message switch YTD. That’s really good, but every unplanned minute of downtime causes an all-hands-on-deck incident. We figured the major incident would be perfect for this type of thing and we were excited to adopt it.
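For a sense of scale, that uptime figure can be turned into minutes of downtime. A minimal sketch, assuming the year-to-date window runs from January 1, 2024 to September 3, 2024 (the end date is an assumption, not a figure from this post):

```python
from datetime import datetime


def downtime_minutes(uptime_percent, start, end):
    """Minutes of downtime implied by an uptime percentage over the window start..end."""
    elapsed_minutes = (end - start).total_seconds() / 60
    return elapsed_minutes * (1 - uptime_percent / 100)


# Assumed window: Jan 1 to Sep 3, 2024, which is 246 days in a leap year.
ytd = downtime_minutes(99.937, datetime(2024, 1, 1), datetime(2024, 9, 3))
print(f"{ytd:.0f} minutes, about {ytd / 60:.1f} hours")
```

Under that assumed window, 0.063% of the elapsed time comes to a little under four hours in total.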
Like I said, it’s all-hands-on-deck - I need to get approximately 30 people on the phone to aid in troubleshooting the situation and determining where the problem lies so we can get the system back up and operational as quickly as possible. That’s where I ran into my first limitation. I created a Group called the “Major Incident Team”. My thought was that when a Major Incident occurred, I would call the team of 30 folks to get them up to speed quickly right through FreshService. However, I quickly found out that I was limited to only ten individuals on the Major Incident team. My hopes of notifying everyone through FreshService were quickly dashed.
I contacted Customer Support several times and they said they would look at removing the limitation for me. After several weeks though, that limitation stayed in place and I received an email saying it couldn’t be changed.
Due to the limitation, I had to start looking at other solutions that were out there on the market. I looked at additional integrations and settled on Twilio. I purchased a number, certified it for text messages and phone calls and built my own system using TwiML on Twilio. I was quickly able to get around the limit on the number of phone calls/text messages, but then ran into my next problem.
I built a custom object with all of the on-call individuals and their phone numbers. My goal was to use a Workflow to determine when a Major Incident occurred and loop through the table to contact all of the employees. Again, my hopes were quickly dashed -- there is no way to use the workflow to loop through a table. You have the ability to get a single record out of a table with the Workflow Automator, but not loop through it. In the end, I created a Workflow “Automator” (if you can call it that) that has 60 actions on it.
Call John at this number and invoke this TwiML
Text John at this number and invoke this TwiML
Call Garrett at this number and invoke this TwiML
Text Garrett at this number and invoke this TwiML
….and so on and so forth, 58 more times.
Any changes to an employee phone number and the workflow must change. Any changes to employment and the workflow must change. It would be so easy to have this just as a table of people who need to know and loop through - modifying table records, but that’s not possible. I have to remember to modify the workflow.
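The loop being asked for is small. Here is a rough sketch of what it could look like if it ran outside the Workflow Automator, for example in a small script that the workflow triggers once. The contact table, its column names, and the two sender functions are all hypothetical:

```python
def notify_all(contacts, place_call, send_text, message):
    """Call and text every active contact; collect failures instead of stopping."""
    failures = []
    for person in contacts:
        if not person.get("active", True):
            continue  # someone who has left stays in the table but is skipped
        for channel, send in (("call", place_call), ("text", send_text)):
            try:
                send(person["phone"], message)
            except Exception as exc:  # one bad number must not block the rest
                failures.append((person["name"], channel, str(exc)))
    return failures


# Hypothetical contact table; in practice this would be read from the custom object.
contacts = [
    {"name": "John", "phone": "+15550100", "active": True},
    {"name": "Garrett", "phone": "+15550101", "active": True},
    {"name": "Former Employee", "phone": "+15550102", "active": False},
]

sent = []  # stand-in senders that only record what would have gone out
failures = notify_all(
    contacts,
    place_call=lambda phone, msg: sent.append(("call", phone)),
    send_text=lambda phone, msg: sent.append(("text", phone)),
    message="Major Incident declared: message switch is down.",
)
```

With Twilio's Python helper library, `place_call` and `send_text` could be thin wrappers around `client.calls.create(to=..., from_=..., url=...)` and `client.messages.create(to=..., from_=..., body=...)`. A phone number change then means editing one row in the table rather than two actions in the workflow.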
Finally, the Post Incident report that gets “generated” isn’t all that helpful right now. I assume folks who have the AI add-on will get a better generated report with an actual timeline - maybe you could dial in how much detail you want. What might even be better is if you could generate an internal report and an external report. Again, I’m just venturing here. My company won’t pay for AI at the moment. I hope it’s something I can add in the future and that I can justify the costs.
Finally, I believe as the Major Incident matures, it will become a better product. The biggest problem is that when we have Major Incidents right now, handling is clunky!
That 10-person limit is crazy. I know that the MI group I have set up has more than 10 people in it, and has not (yet) encountered any issues. Hopefully they can get that resolved for you. Once you have that addressed, have a look at the on-call schedule option, particularly the escalation policies. There is an option there to send messages via automated phone call, SMS, MS Teams and WhatsApp. Heck, if you had the MS Teams bot set up, you could set a meeting in MS Teams via the Major Incident, so everyone looking at it would be able to join. It’s a bit of work to set up, but may help bypass some of your issues. Actually, looking at the escalation policies just now, you can send automated calls etc. to people who are not in the MI group. So you may be able to get around the 10-member restriction that way.
@kenneth.anderson - thanks for the reply. That’s how I originally had it set up. I had an on-call team called “Major Incident Team” and the notification limit was set to 10. If you can add more than that now, that’s great. I’m going to check it out and I’ll report back.
Aha, the notification limit was set to 10! I’m sorry, I completely misunderstood and thought you meant you could only add 10 people to your Agent group! I have not tried contacting over 10 people via the on call schedule, but fingers crossed it works for you
Darn - still looks like the hard limit is 10 names - no change.
Hello folks,
Thank you for registering for the webinar ‘ITOM Unlocked: Master Major Incident Management’ which was held on September 5, 2024.