Monday, September 28, 2015

Response Groups Stop Responding

Our company (Event Zero - makers of the best Skype for Business analytics software out there BTW) relies on Skype for Business response groups for our sales and support queues.  Last week, I noticed that every single one of them on two separate pools were no longer accepting calls.  It wasn't a SIP trunk or mediation issue, because I couldn't get to them by directly entering in their SIP address in the Skype for Business client.  They appeared available (green presence) but would not accept calls. Snooper logs showed they were throwing 480 Temporarily Unavailable errors.

It was especially odd that it happened on two separate S4B pools at roughly the same time.  I tried numerous things, from restarting the RGS service on the affected servers to restarting the servers.  So, yeah, not a lot of tools in my RGS troubleshooting arsenal apparently.

What I found DID work, was to change the Tel URI of one of the workflows to a slightly different number, then changing it back.  Within a few minutes, that particular workflow started working again. 

Rather than doing the same thing to all 20-odd response groups (which would take a LOOONG time because the RGS Workflow web page is so slow), I created a Powershell script to do the same thing.

WARNING: Use script at your own risk. It worked fine for me, but hey I'm not a programmer. Also, this script will use the Description field of the workflow to store the original Tel URI of the response group.  Whatever is there now will get wiped out. You can do this script in other ways, but I was strapped for time and we weren't using the Description field for anything.
$Workflows = Get-CsRgsWorkflow 
Foreach ($WF in $Workflows)
{
 Write-Host "Adding dummy extension to " $WF.Name
 $WF.Description = $WF.LineURI
 $WF.LineURI = $WF.LineURI + ";ext=0"
 Set-CSRgsWorkflow -Instance $WF
}
Foreach ($WF in $Workflows)
{
 Write-Host "Reverting back to original number for " $WF.Name
 $WF.LineURI = $WF.Description
 Set-CSRgsWorkflow -Instance $WF
}

I don't know what caused the problem to start with, but at least this fixed it. Presumably, something happened to "break" the connection between RGS and the associated contact objects, and resetting the Tel URI re-linked them.

Hopefully, this might help others who come across the same thing. If anybody has any insight into why it broke, and why this fix worked, please enlighten me!


Friday, September 18, 2015

Testing Inbound PSTN Connectivity Directly from Lync/Skype for Business

File this one under "It's obvious once you think about it for even a second"...

I was recently attempting to get a SIP trunk working to a new Sonus gateway located in Event Zero's head office in Brisbane, Australia.  I wanted to test inbound calls to the gateway by calling one of our response group workflows.

Normally, I would pick up my mobile or home phone and dial the number to test, but since I was dialing Australia, and I'm a cheap bastard, I didn't want to do that.  It was also the middle of the night in Australia, so I couldn't get someone local to do it for me.  If I dialed the number from Skype for Business, then it would do a reverse-number lookup, find an internal match and route me to the response group internally without going through the PSTN, which wouldn't achieve my goal.

My goal was to route a call to our Australian response group via Skype for Business via the PSTN in the easiest, laziest, least-expensive-to-my-own-wallet way possible (this is sounding like an exam question).  Pick one of the following answers:
  1. Suck it up princess and eat the $2.05 it will cost to call Australia for a few seconds.
  2. Wake up someone in Australia at 2 in the morning, claiming its an emergency.
  3. Convince yourself that since outbound calls work fine, inbound should too.
  4. Use a trunk translation rule to route a dummy phone number to the correct destination.
  5. Give up and start drinking, because its Friday afternoon dammit!
The correct answer is #4 (although the judges will also accept #5).

If you've ever been at one of Doug Lawty's Enterprise Voice sessions at LyncConf or Ignite, you have probably seen his infamous diagram showing how a PSTN call works its way through the system.  To save you the pain, here the relevant details in a very small nutshell:
  1. Dialed number gets normalized to E.164
  2. Skype for Business does a reverse number lookup to see if there's an internal user or service that is assigned that number. 
  3. If there's a match, routes the call internally directly to the user/service.
  4. If there isn't an internal match, it finds a valid route through a PSTN trunk
  5. It applies any outbound number translation to make the number conform to local standards and sends it on its merry way out the gateway to the PSTN.
Since reverse number lookup happens before handing off the call to the routing engine, we can simply call a dummy PSTN number of our choosing, and once it reaches the trunk translation rule stage, change that dummy number back to the proper number assigned to the Response Group workflow.

In the above example, all I do is create a trunk translation rule to change the non-existent North American phone number +1 (555) 999-0000, and translate it to the proper number.  Once the call has progressed to the point where outbound translation is happening, Skype for Business won't be doing another reverse number lookup, so it will continue to route that number out to the PSTN, at which point it should route to the correct destination (which could be right back where it came from). 

So, a simple solution to overcome my unwillingness to waste money on stuff I don't have to.