Thursday, January 12, 2012

Microsoft Lync as Enterprise Voice....No Thanks

There is a lot of hype in the marketplace about Microsoft Lync and how it is a great enterprise voice platform. While I do like the Lync product for IM and presence, I believe it has some flaws when it comes to enterprise voice and some conferencing situations. Before I go any further, I want to remind readers that this is my blog, and as such it contains my opinions. I definitely welcome a healthy discussion, and I do not claim to be the all-knowing expert in all things.

I have been working with Lync for a little while now and as a result I have done a lot of research and troubleshooting. During the process of implementation and testing I have learned a lot about Lync's nuts and bolts. I will attempt to explain why I personally do not believe Lync is ready to be an enterprise voice system. I do believe Lync could be a good voice platform for a small to medium business if you have a good network in place and you aren't trying to make too many calls over a WAN connection. Now, let's get past the shiny paint job and look under the hood.

1. Call Admission Control (CAC)
Call Admission Control is an umbrella term for the mechanisms and practices of regulating traffic volume on a network, particularly when speaking of voice and video traffic. Call Admission Control can also be used to provide a certain level of voice and video quality or a certain level of performance on a network. Typically CAC is most important on slower WAN or internet connections.

When Microsoft originally put OCS on the market, they would tell you that there was no need for Call Admission Control or Quality of Service (QoS). (More on QoS in a minute.) Microsoft believed that their codec, RTAudio, could handle it all dynamically. (More on RTAudio later, too.) In a small to medium business with no WAN links on the network, you probably could go without CAC, and if the business were really small you could try it without QoS, but not in an enterprise.

Microsoft's answer was to add CAC policies when Lync hit the market. Now you would think the problem was solved, right? Nope. CAC in a Lync environment takes place at the application layer. While the Lync approach to CAC can limit the number of Lync sessions routed over a particular network link, it has no idea how much traffic is actually on that link. The application layer is not network aware. The only way to have a reliable CAC approach using Lync CAC is to dedicate a network link to carrying Lync sessions. Once anything else runs on that link, you are risking oversubscription. CAC still needs to happen at the network layer.
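
To make that gap concrete, here is a minimal Python sketch. It is not Lync's actual logic. It assumes a hypothetical 1,000 kbps WAN link, a hypothetical policy that hands out 500 kbps at 100 kbps per session, and 700 kbps of other traffic (say, a backup job) that the policy cannot see. All names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class AppLayerCac:
    """Hypothetical session-counting policy; it never measures the link itself."""
    policy_limit_kbps: int    # bandwidth the policy believes it may hand out
    per_session_kbps: int     # assumed budget charged per admitted session
    admitted_sessions: int = 0

    def admit(self) -> bool:
        # The decision uses only the policy's own bookkeeping.
        in_use = self.admitted_sessions * self.per_session_kbps
        if in_use + self.per_session_kbps > self.policy_limit_kbps:
            return False
        self.admitted_sessions += 1
        return True


def link_load_kbps(cac: AppLayerCac, other_traffic_kbps: int) -> int:
    """Actual load on the link: admitted sessions plus traffic the policy cannot see."""
    return cac.admitted_sessions * cac.per_session_kbps + other_traffic_kbps


LINK_CAPACITY_KBPS = 1000  # hypothetical WAN link
cac = AppLayerCac(policy_limit_kbps=500, per_session_kbps=100)

admitted = [cac.admit() for _ in range(6)]          # the sixth request exceeds the policy
load = link_load_kbps(cac, other_traffic_kbps=700)  # traffic unrelated to the policy

print(admitted)
print(load, load > LINK_CAPACITY_KBPS)
```

In this sketch the policy behaves exactly as configured (five sessions admitted, the sixth refused), yet the link still ends up oversubscribed because the other 700 kbps never enters the calculation.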

Another flaw I see in their CAC implementation is the reliance on the Lync client at the receiving end. Yep, the client, not the server. Say you are in site A, you make a Lync call to site B, and the network connection between the sites has a CAC policy associated with it. Here's how it works.

When the Lync clients in site A and B load up they get a CAC policy from the server they register with. When site A client calls site B client it will be up to the site B client to decide if the CAC policy will allow the call. If your network link is already congested you just sent even more traffic over that network link just to get your call denied. If you ask me, that's bad design.

2. Quality of Service (QoS)
It is true that Microsoft supports DiffServ for QoS tagging, but QoS and CAC are both off by default. My big gripe with the way they implement QoS is how complex it is.


You have to create policies in the Lync server for each of these services, roles, and applications:

  • A/V Conferencing service
  • A/V Conferencing Server
  • A/V Edge service
  • Application sharing engine
  • Mediation server
  • Response Group application
  • Conference Announcement service
  • Application based on the Unified Communications Managed API (UCMA)

You can use Active Directory to push QoS policies to any Windows 7 or Windows Vista clients, which doesn't sound too bad, but if you have any older Windows operating systems such as XP, you will need a group policy to enable the packet scheduler service. Then each client machine will need the Lync Server Management Shell installed. After the shell is installed, you will need to run two commands to get your voice and audio packets tagged for QoS. After that, in my opinion, the Lync Server Management Shell should be uninstalled. Depending on your environment, you could end up in a very high-touch situation.



3. RTAudio Codec
I am sure I will take some heat for this bullet. Let me start off by saying I don't mind the RTAudio codec. I actually think it is pretty decent, but there are some pretty serious implications to consider.

RTAudio is a codec that provides algorithms for dynamic compression, but it makes these compression changes over the course of a few seconds. It also provides for some error correction, but the most dangerous gotcha, in my opinion, is its ability to detect a lossy network and send redundant packets. What's that? You say that sounds really good?

Well, if you have a fully remote workforce I agree. If most of your calls are going over the internet redundant packets are great! You should use any means necessary to try and guarantee your packets get to the destination. However, if you are making a lot of calls across a WAN link and it gets congested you do NOT want all the calls going over the WAN link to start doubling their packets. If you had 20 calls running over a link that gets congested you just turned 20 calls into 40! A small business may never run into this situation, but an enterprise surely will. Some people will say "buy more bandwidth", but that is not a practical solution and it is definitely not a quick or inexpensive fix. RTAudio is a codec best suited for internet telephony solutions. I would rather not use it in my enterprise.
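
The arithmetic behind that worry can be sketched in a few lines of Python. This is only an illustration: the 50 kbps per-call figure is a hypothetical number, and the sketch assumes, as I describe above, that a call sending redundant packets doubles its stream.

```python
def voice_load_kbps(calls: int, per_call_kbps: int, doubled_calls: int) -> int:
    """Total voice load when `doubled_calls` of the calls send redundant packets.

    Assumes a call sending redundant packets uses twice its normal bandwidth.
    """
    if not 0 <= doubled_calls <= calls:
        raise ValueError("doubled_calls must be between 0 and calls")
    normal_calls = calls - doubled_calls
    return normal_calls * per_call_kbps + doubled_calls * 2 * per_call_kbps


PER_CALL_KBPS = 50  # hypothetical per-call bandwidth, for illustration only

baseline = voice_load_kbps(20, PER_CALL_KBPS, doubled_calls=0)
worst_case = voice_load_kbps(20, PER_CALL_KBPS, doubled_calls=20)

# Express the worst case as an equivalent number of normal calls.
print(baseline, worst_case, worst_case // PER_CALL_KBPS)
```

With these assumed numbers, 20 calls go from 1,000 kbps to 2,000 kbps, which is the same load as 40 normal calls on a link that was already congested.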


4. Conferencing Bandwidth
A small business could even run into an issue here. In this section I am referring to multimedia conferences, in other words conferences that involve video, screen sharing, audio, etc. Each conference participant will get a media stream of approximately 550-700 kbps. Let's say we have one engineer in the US talking to three engineers in China. That will be up to four media streams traversing our WAN link to China. Anywhere from 2.2 to 2.8 Mbps is being used up by only four people in a conference. A small business may be able to handle that over a WAN link, but we have conferences with 20-30 people regularly. This is not a recipe for scaling out. Microsoft will tell you that hundreds can be in a conference. In theory I am sure they can, but in practice you are better off using a service like GoToMeeting or WebEx.
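
Here is the same math as a small Python sketch, using the approximate 550 to 700 per-stream range from above (read as kbps). The function name is my own, and the 20 and 30 stream cases assume the worst case where every participant's stream crosses the WAN link, which will not be true of every conference.

```python
def conference_wan_kbps(streams_over_wan: int,
                        low_kbps: int = 550,
                        high_kbps: int = 700) -> tuple[int, int]:
    """Return the (low, high) WAN usage in kbps for a number of media streams."""
    if streams_over_wan < 0:
        raise ValueError("streams_over_wan must not be negative")
    return streams_over_wan * low_kbps, streams_over_wan * high_kbps


for streams in (4, 20, 30):
    low, high = conference_wan_kbps(streams)
    print(f"{streams} streams: {low / 1000:.1f} to {high / 1000:.1f} Mbps")
```

Four streams reproduce the 2.2 to 2.8 Mbps figure. Under the worst-case assumption, 30 streams come to 16.5 to 21 Mbps for a single conference.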

In summary, I think Lync has some potential and I think there are aspects of it that are great today, but in my opinion enterprise voice is a bit of a stretch. I am sure there are examples of it working out there. I am not saying it can't work, but I am saying it isn't as great as the marketing would have you believe. In my opinion there are some fundamental technology flaws that need to be addressed. If you choose to roll with Lync as an enterprise voice system, I would tread carefully and do plenty of testing.

35 comments:

  1. Right on Brother!

  2. Ahh, a network engineer that has discovered voice.
    Proper QoS and circuit analysis, including Erlang, will fix your RTAudio issues. Design the circuit capacity correctly and you will not have any issues.

    1. First of all I would like to thank you for taking the time to read my article and respond.

      I actually started learning voice before I got into networking. Then VoIP came along and I have been increasing my skill set on both ever since.

      As to your comment, I am not concerned with my voice circuits. Those are limited in number anyway. I am also not terribly concerned about RTAudio on a LAN. Theoretically, you could run into issues, but probably not. My biggest concern lies in the WAN and MPLS connections. It isn't economical to have overly large WAN pipes, so traffic management becomes very important. With traditional codecs such as G.729 you know what your payload size will be. You have a very predictable network. If RTAudio is your codec, there is no way to know exactly how much bandwidth each call will be using, especially since the RTAudio codec will double its packet stream if things get congested. All the analysis in the world will not prevent that from happening, and you can't adjust the RTAudio behavior to change that.

      It is true that I could use QoS to separate my voice and data over a WAN or MPLS connection, but QoS is used to expedite voice traffic while cutting back data traffic. When speaking about RTAudio if I use Call Admission Control to limit the number of phone calls I still cannot guarantee how much bandwidth it will use.

      For example, I set CAC to allow 5 Lync calls to cross my WAN connection. I apply QoS to expedite voice traffic. Theoretically, you would never have a problem, right? If latency or jitter is introduced anywhere along the line, which is highly likely considering that many people will deploy Lync clients as a soft phone and so introduce all the potential issues of a PC, RTAudio may decide that it needs to double its packet stream. Now the 5 calls your CAC policy allowed have effectively become 6 calls. With QoS you still want to expedite voice traffic, so you send it on and your data network suffers. If this same scenario became worst case, your 5 calls become 10 and your data network suffers further.

      Maybe you decide to build your QoS so that voice traffic is expedited, but capped. RTAudio will still double its packet stream and now every call will suffer but your data will be fine. In this case as soon as one call doubles the packet stream they all will double their packet streams exacerbating the problem.

      The unpredictable nature of RTAudio is risky in a network design. When voice transitions to VoIP it is no longer a voice problem, it's a network problem, and that's why you need a network engineer to deal with it.

      If you have a mobile workforce and the vast majority of Lync calls will be made over the internet, I think RTAudio makes a great choice and will likely outperform traditional codecs such as G.729. In a situation like this Lync voice could be a great option, but in my opinion Lync voice is not the best choice available for voice inside your firewall and across your WAN.

    2. I agree with the previous comments that you have to know what your network is capable of before deployment. No matter what service you're deploying, if your network isn't set up for it, it's going to have problems. Some of the issues you point out are not correct. You can limit the amount of bandwidth each call consumes across your WAN link with CAC, as well as the total bandwidth consumption for all calls, and with some simple math ensure that your bandwidth is not consumed. RTAudio can be limited to ensure predictability by putting a limit on the bandwidth for each call using CAC. In cases where CAC has deemed you have reached your limit, you can also redirect calls over the internet if you have multiple edge deployments, or over the PSTN if you have suitable PSTN gateways in place.

      http://technet.microsoft.com/en-us/library/ff731056.aspx

      Agreed that if you wish to turn on QoS DSCP marking on every server and PC there is some work to do. The reason it's not on by default is that most companies do not trust PCs to mark packets at the network level anyway, so there is not much point in turning it on everywhere when DSCP markings are going to be reset. Most companies I see are marking at network ingress points based on ports, which can be defined in Lync and with GP for PCs, rather than allowing endpoints to mark in the case of PC devices (Windows and Macs). This can be done for desktop sharing, voice, and video so traffic can be matched to your QoS policies, and when used in conjunction with CAC it makes for a very effective way to control bandwidth consumption.

    3. Chris, thanks for your response. I have read up on CAC and QoS in Lync again after reading your post. I have learned a few more things about it. It does look like you can set some of these things up, but it is still quite complex compared to other available solutions.

      You still end up sending traffic across a WAN to try and set up a call even if the CAC will deny the call.

      If you take the approach of re-tagging traffic based on ports you quickly get into complex QoS policy in your networking gear. With my Cisco system I can have QoS running with a couple of simple commands and I am done.

      Another issue in my case is we have separate Lync environments on either side of the WAN. Since they are not aware of each other I would have to provision twice as much bandwidth to handle the calls or limit the number of calls by half. Granted, if you scaled a single Lync environment out across the WAN that would help, but we are a very large global enterprise and we just don't work that way.

      Another thing I have been reading is that the internet redirection does not work very well, if at all. I will admit I personally have not attempted it, but there are a lot of issues with this based on the forums. Many just give up trying to make it work. I am not saying it can't work, but it is clear that it is troublesome to make it work.

    4. I am not sure what you mean by sending traffic across the WAN. Yes, the Lync client will check with the Policy Service to ensure that it can establish the call, but this is no different from any other solution. At some stage the client has to query the server to see what's happening on the system.

      Retagging traffic, which you mention, is fast becoming the norm. Sure, you don't need to do this if all you have is Cisco desk phones, or for that matter any vendor's desktop phones, but what about soft clients? When I say soft clients I refer to the full ecosystem of different OSes, both desktop and mobile. I think that without port identification and tagging you're not going to have a complete QoS architecture, and you will be limited to having QoS only for Cisco desktop phones. Last time I checked, iPhones and Android didn't do CDP to identify a voice VLAN.

      As for the separate Lync environments you refer to, I have no idea how you're set up, but unless the deployments reside under separate AD forests they will communicate with each other for CAC.

      Take what you will from forums, but without trying it yourself it's a poor way to judge a product's capabilities.

    5. Chris,
      Thanks for responding. I am familiar with your blog. I have read my fair share of it. Nice job by the way! I will try to address each of your points below.

      The traffic across the WAN I am mentioning is related to the fact that the far end client has to make the decision on the CAC. The near end client sends traffic to the far end client to set up a call. The far end client checks with its policy server. Let's say the call is going to be rejected. Then the far end client sends traffic back across the WAN to inform the near end client that the call cannot be completed. If your WAN connection is already congested that seems to be a lot of unnecessary chatter on the line.

      Not all solutions work that way. I will speak on Cisco because that seems to be where the conversation is. With a Cisco implementation you would not send traffic across the WAN to check the CAC policy. It would happen on the near end in a properly designed network.

      Regarding re-tagging traffic, CUCILync does run a CDP driver, so it works as simply as a regular desktop phone. The other soft clients, such as Jabber for iPhone and Droid, do not run CDP, so you are correct that the config would be different.

      Let's take an example of Jabber for iPhone. In the case that you are using Jabber on your iPhone on your WiFi network you would have a few different options depending on what tools are available. If you are running Cisco wireless with or without a controller you can change one command on your switchport. In the case of a controller this would be your trunk port going to the controller. If you are using autonomous APs you would add this command to each port uplinking to an AP. All you have to do is add "auto qos voip cisco-softphone" to the interface and you are done.

      There are other products that bring a different approach that can be quite flexible. If you have an Aruba wireless controller, it does packet inspection. This allows the controller to inspect the traffic and apply QoS policies based on context. It's really very cool and extremely flexible. You could even have an executives group that gets higher DSCP markings than everyone else if you wanted to. Of course the Aruba controller would work with pretty much any voice solution.

      In our case we do have separate Lync environments in separate forests.

      We have done a Lync pilot with voice. We even sent another guy and me to Lync training to make sure it got a fair shake. Prior to Lync I had done a couple of OCS implementations, but they did not include voice. I also had many years in a Cisco voice environment.

      I really did give Lync a fair shake. In the beginning I thought I would end up with Lync, but after working with it and studying more on how it works under the hood I personally feel that it is not as robust as a Cisco voice solution and I got Cisco cheaper than Lync. I really like Lync for IM, presence, and small conferences.

      Cisco has been in the voice game for a long time and that experience shows up in their solution.

      When looking at voice solutions I also looked at Avaya, Shoretel, and Interactive Intelligence. I have worked with all of these in the past. Shoretel is not bad for smaller guys. It's actually pretty simple to run. Avaya is definitely a player, but if you have a Cisco infrastructure it's much easier to work with Cisco, and you typically end up spending more on Avaya. Interactive Intelligence will actually stay alongside our Cisco solution for our call center. They have a great product for call centers, but their support model and architecture are tough to deal with.

  3. Dear Adam
    Your blog is really great as it gives me insight into Microsoft Lync 2010. Currently we are analyzing Unified Communication solutions for our organization, and Lync is one of them. After reading this article I have got almost all the information.

    1. I am glad you found it useful and thanks for taking the time to comment.

  4. Eye opener for me :). Adam, could you please compare Lync with the Cisco solution?

    1. What kind of comparison are you looking for? Which features, etc.?

  5. Adam, do you know if Lync is installed in banking organisations? How would you manage the guarantee of the full system if the solution is composed of a mix of components from different vendors?

    1. I don't see any reason you couldn't use it in the banking industry. It encrypts traffic by default. I personally would not use it for voice in any industry, but for IM, presence, and internal conferencing Lync is pretty decent.

      I'm not sure what you mean by the full system and different vendors. Do you mean different vendors for Lync gateways, phones etc?

      You will certainly have a lot of "hands in the pot" when it comes to Lync support. Microsoft does not sell the hardware to make Lync work as a phone system so you end up with multiple vendors and/or support contracts. In contrast, Cisco provides an end to end solution with one point of contact. They also recently announced that their Jabber software will be included with their Communications Manager. That definitely makes the Cisco solution even more compelling.

      I think the best way to combat multiple-point support when it comes to Lync is by partnering with a local VAR and buying support through them. That way you can hand off that headache to someone else. Of course, you will likely still need hardware support on your gear, and the support services from the VAR will be an additional cost.

      When you say full system, do you mean voice as well?

      With any VoIP solution you will need to architect a solution based on your design goals. You will need to decide if you need High Availability, Fault Tolerance, Budget, Feature set etc. There is a lot more information needed to have an accurate answer to your question.

      Thanks for taking the time to read my blog and add to the conversation.

    2. Thanks for your reply!
      When I say full system I mean a full solution which includes ToIP.
      I see a very complex solution considering the wide variety of vendors.
      Besides, Microsoft's experience in ToIP is less than 2 years, compared with traditional vendors who have provided end-to-end solutions for decades.
      I need HA, fault tolerance, five nines, local survival in branches, standards compliance, third level support, easy management of licensing and upgrading, leading solution in ToIP.

    3. Well I can speak from experience that Cisco will give you all of the items you are asking about. They are the only vendor I am aware of that really does have an end to end solution. You can get everything from the voice gateways to the desk phones from them.

      If you go with their CUWL licensing model it is very simple to figure out.

      Cisco has definitely been in the ToIP game for a long time and their solution is rock solid and scales out easily.

      Not to mention if you go with Jabber it integrates with WebEx out of the box. Our problem with Lync conferencing for large external conferences is available lines to dial in on. To truly get an enterprise level conference experience with PSTN connectivity you either need a lot of lines coming in or you need to utilize an outside service. The cool thing about Jabber is you get the use of an outside service and the user experience like it is in house.

    4. The world's largest bank (HSBC) uses Lync. They have ~ 400,000 employees.

  6. This is kind of old, but I'd like to comment that Adam's comments on Lync in the FinServ industry (or for that matter Lync at large) are not representative of what the industry actually does. I believe Adam plain does not understand Lync enough (or has drunk too much Cisco koolaid) to serve as a reliable reference on Lync. Lync may or may not be the solution for any specific customer, but there is absolutely no doubt that it can be deployed successfully as a voice solution, including at FinServ institutions. The arguments on CAC, QoS, codec and such are misguided and show misunderstanding and/or thinking stuck in the 20th century. Welcome to modernity...

    1. Thank you for taking the time to post a comment on my blog. I appreciate you reading my article.

      My article was written from the perspective of a guy who had to look at the available solutions and decide on a strategy. I have worked with Cisco for a long time, but I have also worked with OCS since it was released. I have installed several OCS/Lync environments, and my comments and information are written from my experience with this technology.

      In my article, I never said that Lync would not work. I did express my opinion on why I believe Lync has some inherent design flaws and I backed up my opinions with fact.

      My arguments on CAC, QoS, and codec are taken straight from TechNet's description of how it works. I never said it won't work. It does, however, work the way I described.

      In the end my experience has taught me that the Cisco VoIP architecture and the end to end support is much easier to work with than Lync. The technology has a longer track record than Lync and you can build an architecture with global support without finding different vendors.

      After implementing both Lync and Cisco environments, my opinion is that Cisco is easier to implement and manage than Lync. It is also easier to troubleshoot and more reliable than Lync.

      Just look past the Microsoft marketing and study how it works under the hood. I love new technologies and I have a passion for IT. If it is better I will jump all over it. I like Lync for IM, Presence, and conferencing. Lync for voice is poorly designed.

      Who knows, maybe they will get better with future releases.

    2. Hi,

      Nice that you took the time to review Lync and express your understanding of it in your blog, but I would like to inform you that your understanding of CAC, QoS, management of WAN bandwidth, and the complexities of these subjects is clearly lacking.

      I would suggest you do some more research on CAC and QoS and how you can use them to manage your WAN bandwidth.

      Oh and by the way, if your WAN link can't carry the signalling traffic when a user tries to establish a call with a user in another site then I would suggest you need to go back to the drawing board for sizing your WAN links appropriately.

      Cheers.....

    3. Thanks for taking the time to read and respond. With respect, all the info in this entry is factual and taken from TechNet.

      I think you are missing some of what I am saying. The WAN needs to be managed by the network gear. There is more going over the pipe than Lync, but Lync is only aware of itself. I can set aside bandwidth to always keep it available for Lync and make it work that way, but then I am wasting it when Lync does not need it.

      I can build QoS policies and make it work, no disagreement there. However, building QoS in Lync is a pain with all the port ranges etc. I can do the same thing with Cisco using only a couple of commands.

      I never said Lync will not work, nor did I say you can't tune it to make it work. However, it is true that it takes a lot more work to make Lync function properly than it does a lot of other vendors, including Cisco.

      The example of the signaling traffic was to make a point that when a congested connection is going to deny a call, Lync still forces that traffic across even though it will be denied. I realize that signaling traffic is a very small amount of traffic, but it doesn't change the fact that Lync is still pushing unnecessary traffic across a congested pipe. No matter how you look at it, that is simply not an efficient design.

  7. I loved your blog, because you tried so hard to explain why you love Cisco and what makes Cisco work for you. That's great. I am a fanatic of Lync, and I have also been using Lync for a while. My company uses both Cisco and Lync. But I think this debate can be likened to the Windows vs Linux debate. That means this simply boils down to taste. We could line up positives and negatives for Lync and Cisco. In the end few people will change their original view, except where the person is completely new; then they may get swayed. I have seen some users fighting tooth and nail to get Cisco voice while others do the same for Lync.

    In the end my conclusion is, it just depends on taste. All these issues on how easy it is to do abc's of Lync and Cisco or design of voice handling features, they just vary by person.

    1. Herbert,
      Thanks for taking the time to respond. I agree there are pluses and minuses to both Cisco and Lync. There are things I like about both of them. They both work. I enjoy all the feedback from readers, and I am glad we can both express our opinions on the technologies we work with every day.

      I also have a bit of a hybrid system between Cisco and Lync. I really like Lync for their IM, conferencing, and presence. For voice I still go with Cisco every time. I have a single vendor to contact for end to end support and I find the configuration to be much more efficient.

      At the end of the day both vendors will always have a user base because we all have different preferences. Different strokes for different folks!

  8. Hi,

    Please provide me a sample QoS policy on Cisco for Lync. I need to know what ports need to be opened.

    1. That is a bit of a loaded question. These policies can be different for everyone. I have included a link to TechNet that describes all the ports that Lync uses by default.

      http://technet.microsoft.com/en-us/library/gg398833.aspx

      I think this will get you the info you need. Thanks for reading and taking the time to post.

  9. How do you prove that Lync will send double packets under congestion? Did you test it in a lab? Please suggest a scenario so I can reproduce it.

    1. Thanks for reading and posting.

      This link will give you a good overview of CAC:

      http://technet.microsoft.com/en-us/library/gg398529.aspx

      You can see at the below link that the bandwidth goes up when latency and congestion go up.

      http://technet.microsoft.com/en-us/library/gg413004(v=ocs.14).aspx

      In order to replicate it in a lab you would need to simulate a small uplink between two clients and overwhelm it.

      Capture the results with your favorite packet sniffer and monitoring software.

    2. We have a setup in our lab with 2 endpoints calling each other peer-to-peer. Even if we choke the link (packets drop), we don't see the call traffic jumping. We don't have CAC enabled. Do we need to enable FEC settings somewhere?

    3. The FEC is enabled by the clients when they detect too much jitter or latency. You will need to cause these conditions in your simulated uplink. I would also put some music on for each side so that the call is generating audio traffic that needs to be sent. Here's a link to a snippet of a book that explains a little more.

      http://books.google.com/books?id=3UXbkg5IktwC&pg=PT1022&lpg=PT1022&dq=enable+lync+fec&source=bl&ots=IG7d_3S_OH&sig=rWOsDt18FjaAZPzuirIfcrAtWOA&hl=en&sa=X&ei=YOU4UeanNITl0QH69oDIAg&ved=0CF4Q6AEwBQ

  10. Interesting, we couldn't reproduce it. Even when we congest the link until it drops the call, the call stays at a constant 50K and then drops. We are using a traffic generator to send UDP traffic from a source to another interface.

    1. It will take around 15 seconds for FEC to kick in. I would focus on introducing latency and jitter, not just congestion. I may have found some software that will help you. I put the link below. I have not used this software, but it is made to help with experiments like the one you are working on. Let me know how it goes.

      http://wanem.sourceforge.net/

    2. This link may also be helpful:

      http://blog.mrpol.nl/2010/01/14/network-emulator-toolkit/

  11. Will try it. Are you sure that we don't need to enable FEC on the Lync client through any CLI? I read in some docs that it's optional.

    1. I could be mistaken, but it is my understanding that FEC is part of the codec and it is enabled dynamically when packet loss is detected. If you do find that there is a way to enable/disable FEC please post back some documentation on it. I would be very interested in reading about it.

  12. Any follow-up since the release of Lync 2013, or on using Office 365/Microsoft-hosted servers?

    1. Unfortunately I have not had the chance to work with 2013 or Office 365 yet. If I do, I will post an update.

      Thanks for taking the time to read and comment.
