Austin Outage issue
Posted by Amit Si on 25 April 2014 12:33 PM

[Updates 12:16 I.S.T. July 31, 2014]

I want to take a few minutes to highlight some of the recent updates & milestones in what has been a long and complex restoration effort over the past few months. Many of these efforts have been described in our earlier posts, and I am recapping them here briefly.

  • Our restoration efforts have been focused into two distinct phases as I had highlighted earlier. The first phase involved operating with large groups of servers and customized software to reconstruct email data. Through this phase, we were able to recover a significant percentage of the email data for many of our users and have been communicating individually with affected email users about the recovery of their emails.
  • The second phase of restoration effort involved working with low-level bit-by-bit techniques that were more experimental in nature. Our goal as I have stated before has been to leave no stone unturned and to try every feasible approach - and this phase was our commitment towards that. In terms of results, this second phase of restoration did not result in increasing our recovery yield by a significant amount, i.e., much of the recoverable data had already been addressed in the first phase.
  • We have been able to recover email data for 45.6% of the total accounts that were on the affected storage unit, 62.7% of all active accounts which were accessed via IMAP, and 71% of all active IMAP accounts with a mailbox size greater than 10 MB. In totality, we recovered 30% of the total data on the storage cluster that failed. As I had highlighted earlier, given the nature of the outage we have unfortunately not been able to recover all data for all users. My purpose in sharing these stats is to continue to provide meaningful updates and honor our commitment of transparency in our efforts and results. These percentages are in no way suggestive of either our commitment to each and every affected user or the quantum of our efforts. Our efforts were equivalent for each and every user with the aim of recovering everything that was possible.
  • Over the past few months we have built and leveraged all the resources and technology we could, with the goal of reconstructing all the data where it was possible to do so. As such, I am confident that we have explored and followed all possible avenues. At this time, while certain tactical efforts will continue, we are closing the broader and sustained effort.
  • Our efforts have not resulted in us recovering all the data, and we are truly sorry for this.

Next Steps

  • Our customer care and technical support teams remain available to take your questions and provide you with any additional information related to your account or this outage.

  • While we will close this particular forum post and thread, my team and I are here for you at all times as we work behind the scenes. On their behalf, I thank you for your patience and your understanding.

Dushyanth Harinath
Sr. Director, Systems Operations

==================================================================

Updates on the email outage at our Austin data center

[Updates 18:20 I.S.T. June 16, 2014] 

The email recovery process is proceeding in the manner that I had highlighted earlier and as of writing this we have completed the restoration for a large number of email accounts as part of our Phase 1 efforts. The recovery process works by incrementally syncing any additional emails that we recover to your 'restored_emails' folder and this process will continue each day until we have determined that no more data can be recovered. This recovery effort continues to be a significant effort and priority for my teams and myself personally. Please stay tuned for further updates. In the meantime, please feel free to connect with our support teams if you have any additional questions. Thank you.
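
A minimal Python sketch of how an incremental sync of this kind might work. It is an illustration only, not the recovery tooling described in this post: the `Mailbox` class, the `sync_recovered` function, and the use of a message ID as the de-duplication key are all assumptions. Only the folder name 'restored_emails' comes from the update above.

```python
from dataclasses import dataclass, field
from typing import Dict

RESTORED_FOLDER = "restored_emails"  # folder name taken from the update above


@dataclass
class Mailbox:
    """Hypothetical in-memory mailbox: folder name -> {message_id: raw message}."""
    folders: Dict[str, Dict[str, bytes]] = field(default_factory=dict)


def sync_recovered(mailbox: Mailbox, recovered: Dict[str, bytes]) -> int:
    """Copy messages not yet present into the restored folder; return how many were added."""
    folder = mailbox.folders.setdefault(RESTORED_FOLDER, {})
    added = 0
    for message_id, raw in recovered.items():
        if message_id not in folder:  # skip messages delivered by an earlier pass
            folder[message_id] = raw
            added += 1
    return added


# Two daily passes: the second pass only adds the newly recovered message.
box = Mailbox()
day1 = sync_recovered(box, {"<a@example>": b"first", "<b@example>": b"second"})
day2 = sync_recovered(box, {"<a@example>": b"first", "<c@example>": b"third"})
print(day1, day2, len(box.folders[RESTORED_FOLDER]))  # prints "2 1 3"
```

Because each pass skips messages that are already in the folder, running it every day does not create duplicates; only newly recovered messages appear.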

[Updates 12:00 I.S.T. June 2, 2014] 

As detailed in my last post, our team has been in the process of restoring data in two distinct phases which use different technology/processes. As of this update, I am happy to report that the first phase of data recovery is nearing completion. Large amounts of data (multiple TBs) have now been processed as part of the Phase 1 data restoration project. At this time, our team is running a secondary process to sync this recovered data with each affected user's email account. Given the large volume of data, we expect this process to continue for several days. Emails that are being restored are being placed in a distinct folder called 'restored_emails' in your account. We continue to correspond with individual users via email during this time to keep them abreast of progress for their accounts.

While we work on completing this first phase, the recovery team is simultaneously working on an additional advanced recovery process ('Phase 2') which is a much more intensive and low-level bit-by-bit restoration that may allow us to recover additional data that was not recovered during the first phase. Please stay tuned for further updates on how this second phase is progressing. In the meantime, please feel free to connect with our support teams if you have any additional questions. Thank you.

 

[Updates 12:00 I.S.T. May 22, 2014] 

 

As mentioned in our last post, we are working on recovering data in phases. Phase 1 of the recovery process is currently in progress and we have started to recover emails across affected accounts. Phase 1 will continue for at least a couple of additional weeks. Our customer facing teams are reaching out to customers via emails to inform them of the status of recovery for each account and we have done this across a large number of affected users already. This communication is going out every day now as we continue to restore additional data.

 

While the Phase 1 recovery process is running, the data recovery team is working on building a Phase 2 recovery program (which is a variation of the Phase 1 program but utilizes additional techniques to support even lower level bit-by-bit data recovery) simultaneously. With the Phase 2 process we expect to recover additional data that was not recovered during the Phase 1 process.

 

Our engineering teams continue to be fully engaged in this activity and are leaving no stone unturned. We are fully aware that we are now into the 4th week of this recovery process. However, given the complexity of the effort we are expecting this to take several weeks more. We thank you again for your continued patience. This remains a very critical issue for us and will continue to be until we exhaust each and every avenue available to us to recover your data. Thank you.

 

 

 [Updates 18:30 I.S.T. May 21, 2014] 

In the aftermath of the email storage outage on 24th April, we set up interim storage devices so that mail services could be resumed for affected accounts. In the meantime, we have worked to create a more hardened storage infrastructure.

We will now be migrating the affected accounts from the temporary servers to this infrastructure. The maintenance is planned as follows:

 

Date :- 22nd May 2014
0700 hours IST :- Mail delivery and access to accounts will be stopped. Inbound mails will be queued up and delivered to respective accounts after the maintenance is completed. Sending of emails will still work, however, they will not be saved to 'Sent items'. Migration activity will begin from the current storage to the permanent one.
0900 hours IST :- End of migration; mail delivery will be enabled. Inbound emails that were sent to our server during the maintenance will start to be delivered.

 

We regret the inconvenience this may cause; thank you for your continued patience. If you have any queries regarding this, feel free to reach out to our support teams.

 

[Updates 10:50 I.S.T. May 17, 2014] 

Summary

  • Email restoration process updates
  • Looking ahead
  • FAQ
  • Next steps

 Email restoration process

The effort our engineering teams have put in over the last few weeks is starting to yield some concrete results. During this time, the engineering team has built a multi-phased automated recovery and restore process for the storage cluster. The first phase of the recovery process is to recover files that are undamaged or are partially damaged. At this time, we are recovering in excess of 500 GB of data a day and we expect that this phase will complete in about 2 weeks under current conditions and estimates. Once we conclude this first phase, we will be launching future phases to recover additional types of data using different means.

Over the last few days we have managed to restore selected emails for a small number of user accounts as part of this first phase of operations. It is important to note that the restored emails in this phase may not be the complete set of emails in these email accounts. Once we complete the first phase of operations highlighted above, my team will continue the process in subsequent phases to restore additional data, if possible. As explained in our previous posts, we are unable to predict or place any guarantees on the quantum or percentage of recovery for specific users. However, we remain committed to recovering all the email data that we possibly can through this multi-phased process. We have started to reach out to users individually where required to keep them apprised of this progress.

Looking ahead

I spoke about revisiting some of our systems and processes in light of this outage in my previous post. We take a lot of pride in the faith you put in our services and we engineer all our products to meet the highest availability & safety standards - right from selecting the best in class data centers, hardware, ISPs, DDOS mitigation, storage systems - to building processes that allow our teams to efficiently operate and manage our systems. Post this outage, my team and I have conducted a detailed review of each of these systems we have in place and also conducted meticulous planning for additional scenarios to boost our preparedness for a wide range of possible issues.

Our primary objectives as part of these sessions have been to review and analyze processes and systems:

  • To ensure consistent off-device backups for disaster recovery of our systems
  • To automate recovery and restore processes in case of hardware failures
  • To perform and conduct frequent drills that test our operational readiness for various kinds of outage scenarios

Our systems have always been designed with these objectives in mind. We have always built our systems to very high standards and provided you with strong services and SLAs across a range of products for well over a decade. However, this detailed review has highlighted a few specific areas where we can improve and my team and I are working rapidly to resolve these gaps.

FAQ

I am posting here some additional FAQs for users whose emails may have been recovered partially or not at all and who have received an indication to this effect.

  • Why is my account not restored yet?
  • It is likely that we have not yet reconstructed the files that correspond to your email account. As explained above, the recovery and restore process has commenced and is a multi-phased activity. Restores will continue as and when data is recovered. The process is completely automated and attempts to recover large chunks of data in an efficient manner and as it rebuilds files we are working to place them into your account.
  • I got an email which said that my account is restored but I don't see all my folders/mails?
  • Unfortunately, the way this cluster failed made some data completely irrecoverable so it may be possible that we have managed to recover all the data we could and will be unable to recover any additional files. It is also possible that this data might be recovered in a subsequent phase of data recovery. We will only know when the process has completed in its entirety. 
  • My account was restored, and I see some emails but am not able to open or view them?
  • If you encounter this issue, please get in touch with our support team and we will take a look.

Next steps

In conclusion, I want to reiterate the following.

  • Thank you for the faith you have shown so far - and like I said earlier, this continues to be my and our engineering teams' highest priority at this time. We are moving as swiftly as we can to recover and restore data.
  • I will keep communicating important updates at every step.

 


[Updates 12:26 I.S.T. May 14, 2014] 

Summary 

 

- Post-mortem findings
- Details on the email restoration effort
- Frequently asked questions
- Next steps

Post-mortem for the outage

Our senior tech and management team has concluded a detailed root-cause analysis for this outage and here are the findings. At around 2 PM IST on Thursday April 24th, we were in the process of provisioning a new storage cluster at one of our global data centers. As part of this process, an aggregate (which is a collection of multiple disks spanning RAID groups) holding the production data and backup snapshot volumes for some of our email users was rendered inoperable while attempting to build a new aggregate.

We noticed the services on this storage cluster failing almost immediately and this was highlighted by our network and systems operations team which operates 24x7 across all our facilities and services. We immediately halted the offending deletion/new aggregate creation processes as soon as we detected the issue. Our first goal at this point was to restore email services, which we did as soon as possible, by migrating all users on this storage cluster to a different cluster. We simultaneously started work with a team of experts to re-build the storage cluster and to bring it back online - and those efforts continue as of date with more updates in the following sections.

In response to this incident, our senior technical team immediately put into place a series of stringent change control measures and oversight, which went over and above the systems already in place, to ensure that no further opportunity exists for an outage caused by a similar event in the future.

The email restoration process

At around 10 PM IST on Thursday April 24th, we initiated a process of bringing the storage cluster back online by constituting an advanced team dedicated to this purpose. This process involves reconstructing individual files across the storage cluster from the ground up. As we have highlighted earlier, it is very time consuming since it requires us to reconstruct the storage cluster and all of its components. A team of engineers has been working on this effort non-stop from the day of the outage. The restoration effort involves a multitude of software, scripts, hardware systems, and manual inspection processes to carefully rebuild the cluster. We were advised at the very onset that this process would take several weeks and our communication to you has been consistent with this. A secondary task force composed of our senior-most engineers and managers has been reviewing progress every day and continues to do so.

As our engineers have progressed further on this restoration effort, they have also advised us that certain files on the aggregate are not recoverable. They estimate that at least 25% of the underlying data (not to be confused with users/mailboxes) on the storage cluster is not recoverable. We do not have the ability at this time to tell you what this means for your individual mailbox or your account. However, this restoration effort continues and will continue until we are certain that we have done everything in our power to restore emails for each and every affected user, though we cannot guarantee the end results, and we apologize for presenting you with this uncertainty. Our goal has been and will continue to be to share meaningful updates as soon as possible.

Frequently asked questions

Over the past few weeks, we have had conversations with a number of you about this outage. We have created the following FAQ based on the questions we have heard most often. We look forward to addressing any additional questions over email / phone.

What about backups? Don't you have any?

We spoke about this earlier in our post from April 26th. Our backup strategy was to create periodic snapshots of the email data. Given that these snapshots were stored on the same storage cluster which is now offline - we have no access to them at this time.
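
To illustrate why snapshots kept on the same cluster as the live data offer no protection when that cluster fails, here is a minimal Python sketch of a check that flags such volumes. Everything in it is hypothetical: the `Volume` record, the `backups_at_risk` function, and the cluster and volume names are assumptions made for illustration and do not describe the actual storage layout.

```python
from typing import List, NamedTuple


class Volume(NamedTuple):
    """Hypothetical inventory record for one mail volume."""
    name: str
    data_cluster: str       # where the live data is stored
    snapshot_cluster: str   # where its backup snapshots are stored


def backups_at_risk(volumes: List[Volume]) -> List[str]:
    """Return volumes whose snapshots would be lost together with the live data."""
    return [v.name for v in volumes if v.data_cluster == v.snapshot_cluster]


inventory = [
    Volume("mail-vol-1", data_cluster="cluster-a", snapshot_cluster="cluster-a"),
    Volume("mail-vol-2", data_cluster="cluster-a", snapshot_cluster="cluster-b"),
]
print(backups_at_risk(inventory))  # prints "['mail-vol-1']"
```

In this toy inventory, only the second volume would still have usable snapshots if `cluster-a` went offline.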

Why is this taking so much time?

The reason the team is taking the time that they are is as follows. They are working on building the meta-data of mailbox files for each individual account on this storage cluster. They are writing scripts/tools to assist with both the manual and automated restoration of files on the cluster. They are running tests to validate that the restored data is valid and usable. All of these processes require building a series of complex software systems and specialized hardware to operate on those systems. The team also has to double back often and try out new approaches when certain efforts do not yield the desired results. We remain committed to allowing this team, which we believe is composed of some of the best engineers in the industry, time to complete this effort.

Is my data lost? Will you ever be able to recover it?

We cannot and don't want to promise specific outcomes given that the restoration process has not progressed to a stage that will allow us to do this. At some point over the next few weeks (and we don’t know the exact date since this is a complex effort akin to a reasonably large software/hardware engineering project) we will know definitively the status of each and every affected email account. At that point we will communicate specifically with you to talk about the results that our engineering team has been able to generate. At this time, the goal for this team is to re-build and re-construct the cluster and we request your patience while they continue to do that. 

These answers are not sufficient - you need to tell me more

We are sorry if these answers appear to be insufficient. This forum post reflects the most recent update we have at this time. Please realize that our front line support team and our senior managers are all working hard to create the best possible outcome they can under the circumstances. We would be glad to answer a different question if you call or raise a ticket with us.

What are you doing to prevent this or something like this from happening again?

We realize that this incident impacts your trust in us and our services. When we learned of this outage, our senior most technical engineers immediately put in place a series of measures to ensure that a similar outage or issue would not happen again on our systems. We also commenced a detailed deep dive into our systems and processes which goes far beyond this particular incident with the goal of demonstrating the rigor and confidence with which we build and deliver our services to you. We do not take this responsibility lightly. We know you deserve more detailed understanding of the work we are doing in this area and we will share this with you over the coming weeks.

Next steps

In conclusion, we want to reiterate the following.

  • We are very sorry
  • We are working hard to try and bring this issue to the best possible resolution
  • We will continue to share all meaningful updates

Thank You.

 

[Updates 13:35 I.S.T. May 9, 2014]

Our engineering teams are continuing the restoration effort. As highlighted earlier, this process will take several weeks since it is a complex engineering effort involving multiple engineers, locations, hardware and software resources. Given our goal of providing meaningful information to you as it becomes available, we will post on this forum as soon as we have significant information to share.

In the meantime, we remain available to discuss this with you at all times through our regular support channels i.e. via calls on 91-22-30797979 and on tickets at https://support.bigrock.com.

Thank you as always for your patience.

================================


 

[Updates 11:20 I.S.T. May 8, 2014]

At this time we have no additional information to provide over and above our update from yesterday. Efforts continue unabated and the primary purpose of sending this update is to continue our engagement with you every 24h as promised and to reassure you that we continue to drive this issue with the highest priority.

================================ 

[Updates 9:00 I.S.T. May 7, 2014]

Our engineering teams are continuing to attempt email restoration through a number of approaches at this point - both manual as well as automated. As highlighted earlier, a large part of the effort to date has been towards building processes, deploying hardware, and developing software and scripts to drive this project further.

Our engineering team continues to emphasize that while they are making progress, the expected timeline for completing the recovery effort is several weeks, and even at the conclusion of those efforts we have no guarantees on the recoverability of emails for specific accounts. Our engineers have indicated that over the next few days they may be able to partially recover some emails for certain users as they make their initial pass through the system, and we will continue to communicate with those users as and when that happens. In the meantime, we appreciate very much your patience and understanding while our teams continue their efforts.

 
=========================
 
[Updates 09:10 I.S.T.  May 6, 2014]

We have no significant updates to present today related to the restoration process. Our tech teams are attempting a number of restoration methods that are intensive in terms of people, hardware, and time. These efforts are currently underway 24x7 across our global teams as indicated earlier. We will continue to update this thread through this process.

IMAP services have now been restored as of today to all of our customers as promised in our update dated May the 1st. Please reach out to our support team if you are still having issues with IMAP.

Thank you as always for your patience and understanding 

=========================

 

[Updates 11:30 I.S.T.  May 5, 2014]

We have restored IMAP services for all customers that are affected by this outage. You will now be able to access your emails from multiple devices.

=========================

 

[Updates 17:00 I.S.T.  May 4, 2014]

- At this time we have no additional information to provide over and above our update from yesterday. Efforts continue unabated and the primary purpose of sending this update is to continue our engagement with you every 24h as promised and to reassure you that we continue to drive this issue with the highest priority. 

- This forum will continue to be our primary mechanism for providing updates as soon as available. In the meantime we truly appreciate your patience. Thank you.

========================= 

 

 [Updates 17:00 I.S.T.  May 3, 2014]

- As highlighted earlier our engineering team continues their effort of creating a framework to allow the restoration activities to continue in their second week. They are making fair progress towards this goal.
- We recognize that this process appears to be taking a significant amount of time. However due to the nature of the restoration process - this activity will stretch into several weeks as we had indicated earlier. Having said that the intensity of our recovery efforts has not lessened at any time.
- We look forward to connecting with you in 24 hours or earlier if anything changes.

=========================

 

[Updates 20:00 I.S.T.  May 2, 2014]

  • Recovery efforts continue at this time across multiple shifts over a 24x7 timeline. We are continuing to leverage all possible resources in terms of people & hardware to drive this process as aggressively as possible. The team has been working on this activity without pause for one week now. At this time, the engineering team is working on mapping the structure of the storage cluster and in creating a blueprint for recovery efforts moving into the second week.
  • We have no updates on the recoverability of email across specific users at this time - but a key focus of the recovery effort is to understand this aspect in detail.

Thank you as always for your patience. We realize that this has been a challenging week. We are doing our absolute best and our senior technology & management teams are monitoring this process very closely. Our next update will be in 24H or earlier in case of any additional developments

=========================

 

 

[Updates 20:11 I.S.T.  May 1, 2014]

Restoration efforts continue around the clock as indicated earlier and unfortunately we have no further updates to share at this time other than that which was shared previously. We will post any additional information on timelines or progress on this forum as it becomes available. Thank you for your continued patience and understanding.

 

=========================

 

 [Update 13:05 I.S.T.  01/05/2014]

Our Engineering team continues work on email recovery and to bring the storage cluster online; we are starting to see some limited progress. This process is very time-intensive, and we continue to work 24x7 on it.  As we communicated earlier, it is likely to take several weeks to complete this process.


We have scheduled IMAP service restoration for Monday, May 5th. We will continue to post updates here as they become available.

Thank you for your continued patience and understanding.

 

=========================

 

[Update 21:00 I.S.T.  29/04/2014] 

Summary 

  • Mail restoration efforts continue
  • Information on IMAP availability

 

Updates on recovery process 

The process to bring back the storage cluster continues at this time. We have started to see some progress from the engineering team around recovering critical pieces of meta-data that will aid in rebuilding the storage cluster and identifying the contents of and starting the process of recovering the emails contained therein. Please note that this process is highly time-intensive since it requires rebuilding individual files distributed across multiple and redundant disks & re-constructing inbox structure for all affected users piece-by-piece. As highlighted in yesterday's update - this will take several weeks to complete. While our engineering team remains hopeful that we will be able to restore a portion of the data, we are unable to share specifics on exactly how much or what is recoverable at this time. It is our commitment to continue updating you throughout this process.

 

Activities in the next 24H 

Here are the broad set of activities our various teams are working on over the next 24H

  • Email recovery process continues as explained above - we are monitoring the progress of & driving this round the clock - efforts will continue 24 hours a day, 7 days a week
  • On 25/04/2014, we had shut down IMAP services for all affected email users to allow emails they had stored offline (via clients like Outlook & Thunderbird) to be backed up on local storage and prevent those offline emails from being deleted due to an IMAP sync. We are now starting the process of reaching out with specific instructions to allow users to store those emails in local folders in preparation for turning on IMAP services for all users. We will be sending out detailed communication around this over the next 24H. Please reach out to our support team with any additional questions about this.

Timeline for next update

We will send out the next update in the next 24H (or earlier in case of any other important developments).

 

=========================

  

[Update : 09:00 I.S.T. 29/04/2014]

How did this happen?
We were bringing a new storage cluster online, and during this process data was inadvertently removed from an existing storage cluster.

What are you doing to get my mail back?
The removal of this data has made it very challenging to restore your messages. As we’ve mentioned, we are working with storage industry experts to reconstruct the data, and work continues around the clock towards this end. However, this is a very time consuming process, and it may take a few weeks to complete.

Are my messages gone forever?

It is possible that some of your messages may be unrecoverable; however, at this time we have not completed our work on the restoration. We will continue working on our recovery efforts, with an eye towards restoring as many messages for as many customers as quickly as possible.

 

Should I be worried that this is going to happen again?
The error that occurred was highly unusual. We are carefully analyzing what happened and are implementing additional safeguards to prevent this particular event from reoccurring.

 

=========================

 

[Update : 23:14 I.S.T. 27/04/2014]

Apologies for the delay in updating the forum post.

We assure you that this issue is being treated with the highest priority and our system administrators are working round the clock with experts from the storage industry to restore access to all services. Unfortunately this is proving to be a very time consuming operation and as of now we cannot provide you an ETA on this, but we will make sure we update this forum as soon as we have any new information.

We deeply and sincerely regret the inconvenience caused. We appreciate your patience and co-operation.

 

=========================

 


[Update : 22:00 I.S.T. 26/04/2014]

Why is bringing this storage cluster online taking so long? 

The way in which this storage cluster went offline has required specialized techniques to restore.  As we mentioned previously in this post, we have some of the foremost experts in the industry working on this around the clock.

When will my mail be available? 

At this time we cannot confirm a concrete timeline.

Can’t you restore data using backups?
 
As mentioned in our last update, the storage cluster failed during a planned activity to bring another storage cluster online. Our backup strategy here consisted of creating periodic snapshots of the data. These snapshots were stored on redundant & highly-available systems that comprise this storage cluster. Unfortunately, given that the storage cluster is not functional at this time, we do not currently have access to these backups.

When can I expect the next update?

We will continue our updates – and provide you with one every 24 hours – or sooner if we have additional details to share. Thank you for your continued patience.

 

=========================

 


[Update : 13:25 I.S.T. 26/04/2014]

As previously mentioned, our Engineers are still working round the clock to try and restore all mails from the storage units. We still do not have any ETA on this but we will make sure we update this forum with relevant information as soon as we get any update from them. We appreciate your patience and co-operation for the same.
 
Any inconvenience caused due to this is deeply regretted.

 

=========================

 


[Update : 09:15 I.S.T. 26/04/2014]

While we are working to restore access to all services, we have taken some additional proactive steps to ensure that users who are currently using IMAP and have older emails stored (downloaded through their local clients such as Outlook or Thunderbird) continue to have them saved through this outage. Detailed information pertaining to this change is given below:

What are we changing?
 
We are disabling the IMAP functionality for a subset of domains that were affected by the email issue.
 
Why are we disabling IMAP?
 
While the restoration of the historic data is still in progress, we are trying to ensure that the users who have a local copy of their emails do not get affected when they try to sync their email accounts via an email client like Thunderbird or Outlook. Once IMAP is disabled, there won't be any changes to the emails that are currently synced.
 
I don't want to use webmail. How can I continue to use my mail client?
 
We strongly recommend that you use webmail to access your emails. However, if you wish to use your email client, you may have to use POP for your email account while keeping your previous IMAP settings intact. Here are the detailed steps explaining how to use POP for your email account.
 
https://support.bigrock.com/index.php?/Knowledgebase/Article/View/201/7/various-email-client-configuration
 
Important Note:
Please note that you should NOT remove the existing IMAP account. You have to create a new POP account in the same email application (Outlook, Thunderbird, etc.).

Thank you for your continued patience and support.

 

 

=========================

 

[Update : 20:00 I.S.T. 25/04/2014]

 

Here is a broad timeline and summary for this outage:

 

- At around 14:50 IST on April 24th we detected that one of the storage units which serves part of our email infrastructure became non-operational causing email services for a subset of our customers to not function.

 

- We immediately started a process of restoring access to this storage unit, and our engineers worked continuously over the next few hours to restore functionality to affected email accounts.

 

- As of 22:00 IST on April 24th, access to all email accounts affected by this issue was restored. All affected accounts were at that time able to send and receive email via webmail and through external clients.

 

- Email that was queued up during this outage was relayed to individual email accounts after email services were restored, and email delivery should be working without interruption at this time.

 

- As of this update, we are still working to restore this storage unit to a fully operational state. This process is taking longer than we originally anticipated.

 

- We currently have some of the best storage unit experts in the industry working round the clock to bring the unit back online. We are currently being advised that this process can take several days to complete.

 

What is the current impact? Is anything still non-functional?

Many customers who were affected by this downtime are currently unable to view emails received prior to this outage, specifically those emails that were stored on the affected storage unit and were being accessed via either webmail or IMAP. However, if you were downloading your emails to a local email client such as Outlook or Thunderbird using the POP protocol, there is no further impact to your service due to this issue.

 When will you fix this issue? What can I expect in the interim?

Our goal is to restore access to your stored emails as early as possible, and we are sparing no effort towards this. We care deeply about ensuring that you have best-in-class services, and we apologize deeply for the inconvenience caused to you by this outage. We are now being advised that bringing up the affected storage unit is, unfortunately, not a matter of hours but of days. In the interim, we will endeavor to keep you updated as best we can with any and all meaningful updates, because we do care about getting you back up and running fully. As much as it pains us, we cannot offer a concrete ETA at this time, but we will continue to update you regularly.

 What happened? Can you explain in detail what really went wrong?

One of the storage units that stores email for a subset of our customers went offline as a result of a set of pre-planned activities we were working on to bring a new storage cluster online. We build our systems with multiple layers of fail-safes and safeguards, and we are still in the process of doing a post-mortem on what we could have done better to prevent this from happening. As soon as we complete this investigation, we promise you a full account of it, along with a detailed summary of the steps we will take to prevent a similar event in the future.

What can I do if I have additional questions?

While we realize you will still have questions, we hope this post adds to your understanding of this outage. We promise to be here to work with you and provide transparent & meaningful updates as soon as we are able. This post will be the primary medium of communication with you, and as such we are directing our contact center agents to refer you here so we can have clear and consistent communication with you. However, we will continue to be available on 91-22-30797979 and on tickets at https://support.bigrock.com should we be able to offer any additional assistance during this time.

Thank you for your continued patience and support.

 

=========================

 


[Update : 1700 I.S.T.]

Apologies for the delay in updating the forum post. We have been working round the clock to restore access to mails received prior to this outage. Newer emails should be sent/received without any issues. 

We assure you that this issue is being treated at the highest priority, and a majority of our resources and manpower have already been redirected towards fixing this. Despite our best efforts, we are unable to give you an ETA at the moment. We request you to bear with us while we update you with further progress on the issue.

Thank you for your patience and co-operation. 

 

=========================

 

[Update : 2130 I.S.T.]

Currently, most of the mail accounts should be accessible and you should be able to send and receive mails. The emails that were queued up on our inbound servers since the start of the outage are being released gradually, and all such mails should be delivered within another hour.
 

=========================

 

[Update : 1800 I.S.T.]

At around 14:00 IST/ 08:30 GMT today, one of our storage units which hosts email accounts went down. We are currently working with our storage vendors to bring it back up.

The impact of this is that accounts hosted on this unit are not able to access their emails over webmail, POP and IMAP. These accounts still have the ability to send out emails, and all incoming mails are being queued on our inbound SMTP servers for subsequent delivery.

In the meantime, we are setting up an additional channel so that new emails (received after 14:00 IST) can be received. The ETA for this is 2 hours.

Once the storage unit is fully functional and we complete analysis, we will restore access to old emails. 

 

=====================================

 

[Update : 1530 I.S.T.]

The problems have been identified as issues with a storage unit. Our system administration team is hard at work getting this fixed. Stay tuned to this post for more updates.

 

=====================================

 

Reported issue : 1450 I.S.T.

We are currently facing intermittent issues with the mail service hosted on our Austin mail servers and Enterprise Email. If your domain's MX records point to any of the hosts below, your services might be affected:

  • us2.mx1.mailhostbox.com
  • us2.mx2.mailhostbox.com
  • us2.mx3.mailhostbox.com

or

  • us3.mx1.mailhostbox.com
  • us3.mx2.mailhostbox.com
  • us3.mx3.mailhostbox.com

You can check your domain's MX records using the link - http://www.mxtoolbox.com. You might face errors while logging in to your webmail account.
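
As a rough illustration of the check described above, the following Python sketch compares a list of MX hostnames (for example, copied from the results of an MX lookup tool) against the hosts listed in this post. The names `AFFECTED_MX_HOSTS`, `normalize` and `is_affected` are hypothetical and chosen only for this example; the sketch does not perform any DNS lookup itself.

```python
# Hosts named in this post for the affected Austin mail servers.
AFFECTED_MX_HOSTS = {
    "us2.mx1.mailhostbox.com",
    "us2.mx2.mailhostbox.com",
    "us2.mx3.mailhostbox.com",
    "us3.mx1.mailhostbox.com",
    "us3.mx2.mailhostbox.com",
    "us3.mx3.mailhostbox.com",
}


def normalize(host):
    """Lower-case a hostname and drop surrounding whitespace and any trailing dot."""
    return host.strip().rstrip(".").lower()


def is_affected(mx_hosts):
    """Return True if any of the given MX hostnames appears in the affected list."""
    return any(normalize(host) in AFFECTED_MX_HOSTS for host in mx_hosts)


if __name__ == "__main__":
    # Hypothetical input: MX hostnames as reported by a lookup tool.
    my_mx_records = ["us2.mx1.mailhostbox.com.", "us2.mx2.mailhostbox.com."]
    if is_affected(my_mx_records):
        print("Your domain may be affected by this issue.")
    else:
        print("Your domain does not use the listed MX hosts.")
```

Trailing dots and letter case are ignored in the comparison, since lookup tools may report the same host in either form.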

Our system admin team is already working on this and is trying to fix it as soon as possible. We sincerely regret the inconvenience caused.


Please watch this thread for further updates.