Hi folks,

This time I have decided to share our best practices material for SharePoint. Yes, I know, some of you SharePoint savvies would question its meaning, as SharePoint is SOOO broad and infrastructure is just one pillar. TRUE! However, it’s the foundation, and as such you have to invest there first before building your service consisting of web applications, sites, web parts, workflows, dashboards and… you get the idea 🙂

So here’s a summary of the attached presentation, which James Baldwin and I presented back at EMC World 2011 in Vegas.

I’ve also added a lot of links and references for you to find the technical material necessary for planning and deployment activities.

Please feel free to comment, as I would love to get your feedback on what works/doesn’t work in your environment.

Servers and Virtualization

Virtualizing SharePoint servers has the same benefits as virtualizing any other application and/or database servers in your datacenter, but there’s actually more than “just”:

  • Consolidation – Achieve 2-10x consolidation ratio, especially for larger deployments
  • Performance – Improved front end performance with more, smaller WFEs rather than few large WFEs.
  • Maintenance – Live migration of virtual machines (VMware vMotion, Hyper-V Quick/Live Migration)
  • Load Balancing – Maximized overall performance with balanced HW utilization across the farm (VMware DRS, SCVMM PRO)

Aside from all those obvious benefits, there are a couple that stand out in distributed configurations like SharePoint:

  • Availability – VM based protection for SharePoint provides HOMOGENOUS availability (VMware HA, WSFC)
  • Business Continuity – Simplified DR management (vCenter Site Recovery Manager, Cluster Enabler)

Virtualization has been fully supported by Microsoft since the launch of SVVP (Server Virtualization Validation Program) in 2008.

SharePoint is a perfect candidate for horizontal scaling, meaning scaling out server roles as more resources are required. You would actually get some performance benefits when Web servers (front/back end) are broken into multiple instances by scaling out rather than up, utilizing the same hardware resources. In many other cases I find that this works for the application and SQL server roles too.

Don’t get intimidated by the hypervisor overhead (<10%). Again, you can always distribute content databases across multiple SQL instances and partition the index across multiple Crawl/Index servers. All server roles in SharePoint 2010 can scale out!

Some would recommend leaving index and/or SQL physical, but we have proved many times that ALL server roles can be virtualized without any problem while balancing processing (CPU) pressure (I wouldn’t be that worried about I/O throughput) across multiple virtual machines. Microsoft’s general recommendation to dedicate at least 8 physical cores for a medium sized farm is very generic and I don’t see that as a showstopper. If you take that guidance literally, I would suggest waiting for the next rev. of vSphere (very soon) and Hyper-V (later this/next year), which would present up to 32 vCPU support.

  • Plan for USER LOAD peaks and not for systematic peaks. From what we have observed in lab tests and actual customer data, SharePoint’s regular timer jobs are responsible for most of the I/O and CPU peaks.

    Virtual SharePoint farm - Reference Architecture

The configuration above supported more than 20,000 users with 10% concurrency using an ESX cluster of only three Dell R910 servers.
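To put the concurrency figure in concrete terms, here is a minimal Python sketch of the arithmetic. The function names and the even spread across hosts are illustrative assumptions; the 20,000 users, 10% concurrency and three hosts are the figures quoted for this configuration.

```python
def concurrent_users(total_users: int, concurrency_pct: int) -> int:
    """Users active at the same time for a given concurrency percentage."""
    return total_users * concurrency_pct // 100


def per_host(concurrent: int, hosts: int) -> int:
    """Concurrent users per host, rounded up, assuming an even spread (an assumption)."""
    return -(-concurrent // hosts)  # ceiling division


active = concurrent_users(20_000, 10)  # figures quoted for the reference architecture
print(active, per_host(active, 3))
```

In practice DRS or PRO would balance VMs rather than users, so the per-host number is only a rough sanity check on consolidation.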

Storage Planning

Of course, in most cases you would have to plan for performance and not capacity, definitely for SharePoint farms in production environments. Before getting into details, here’s a good picture of where SharePoint data is located:

For a complete list of all databases installed with SharePoint, go to Database types and descriptions (SharePoint Server 2010).

Performance

To point the finger at the “hottest” areas in terms of I/O, I would rank them in the following order of importance (IOPS and latency):

  1. Search databases (Crawl and Property) data and log files
  2. Query Server/s – Query component/s
  3. tempdb data and log files
  4. Database logs

From our lab tests, this is what we have found in terms of I/O sizes and R/W ratios. As you can see, this is not your typical OLTP or OLAP profile. SharePoint workloads can vary, but here’s some data you might find useful when planning storage resources:

Microsoft suggests really high I/O requirements derived from the search components of the farm. While that is definitely true for some farms, I would question whether the suggested IOPS requirements apply to most environments. Here’s what we have observed vs. the TechNet recommendations:

Capacity

For sizing, look no further than the following TechNet articles:

Here’s a summary with our recommendations:

SQL Configuration

  • Use a 64 KB allocation unit size (cluster size) when formatting a DB volume (MSDN)
    • Plan database file sizes accordingly
      • Don’t rely on autogrowth – File growth can cause locking and has some performance implications. Set file sizes and autogrowth increments appropriately
    • When using Thin/Virtual provisioning
      • Use the “Quick Format” option
      • Enable Instant File Initialization, as it speeds up data file creation, restores and data file growth
      • Grant the SQL service account the “Perform Volume Maintenance Tasks” permission
      • Just bear in mind that log files (ldf) are fully allocated and zeroed upon creation or expansion
    • Standard storage response time guidelines apply. On a well-tuned storage system, ideal values would be:
        • 1–5 ms for Log
        • 4–20 ms for Data on OLTP systems (ideally 10 ms or less)
        • 30 ms or less on DSS (decision support system) type
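The response time guidelines above can be turned into a quick check against measured latencies. This is only a sketch: the thresholds are the values listed above, while the volume type keys and the "ideal"/"acceptable"/"investigate" labels are names I made up for illustration.

```python
# Upper bounds in milliseconds: (ideal_max, acceptable_max), from the guidelines above.
# The keys and the rating labels are illustrative, not an official scale.
THRESHOLDS_MS = {
    "log": (5, 5),          # 1-5 ms
    "oltp_data": (10, 20),  # ideally 10 ms or less, up to 20 ms
    "dss_data": (30, 30),   # 30 ms or less
}


def rate_latency(volume_type: str, latency_ms: float) -> str:
    """Rate a measured average response time against the guideline values."""
    ideal_max, acceptable_max = THRESHOLDS_MS[volume_type]
    if latency_ms <= ideal_max:
        return "ideal"
    if latency_ms <= acceptable_max:
        return "acceptable"
    return "investigate"


print(rate_latency("log", 3), rate_latency("oltp_data", 15), rate_latency("dss_data", 45))
```

Feed it the average disk response times you collect per volume (from perfmon or your array tools) to spot the volumes worth a closer look.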

FAST VP

In general, FAST VP provides great value, but for maximum efficiency it depends on the storage role:

  • Search Index component – No (Highly changing, throw-away data)
  • Search Query component –  Yes (Highly-read data with small burst write changes)
  • TempDB – Yes (The same blocks are re-used on disk, and tempdb performance directly affects SharePoint request performance, since tempdb is used in every SharePoint request)
  • Content databases/BLOB Store – Maybe (Depends on the diversity of the workload, e.g. if some site collections tend to be busier than others)

CX/VNX Thin Provisioning and LUN Compression

BLOB Storage (RBS)

We work with Metalogix StoragePoint for BLOB externalization. Currently all EMC storage solutions are supported with StoragePoint, as it has connectors to file, block and object storage. Some of the possible BLOB stores:

  • Symmetrix VMAX – Block
  • VNX – Block and/or File (NFS/CIFS)
  • Atmos/VE – object (REST)
  • Centera – Centera API
  • Isilon – File (CIFS/NFS)
  • Data Domain – File (CIFS/NFS)

While each would make sense to our customers depending on their general storage design and requirements, please consider the following guidelines:

  • Latency – TTFB (Time to First Byte) should be less than 20 ms
  • Recommended maximum content database size remains 200 GB (guidance might change by MS)
  • SQL RBS FILESTREAM provider works, but it doesn’t scale as well as other RBS providers like StoragePoint
  • Performance improvement would be more salient when
    • Externalizing larger objects (>1MB)
    • Read-intensive access
  • File size limit remains 2GB even with RBS
  • Backup and Replication considerations:
    • Native/Item level backup (stsadm based) would include BLOBs
    • SQL based backup would only protect the content database metadata
      • To maintain consistency:
        • Backup – First Content Databases then BLOB Store
        • Restore – First BLOB Store then Content Databases
    • For DR purposes always tie RBS volumes with SQL Server volumes
    • For faster recovery, consider larger intervals of garbage collection jobs (Keeps previous BLOB versions)
    • Here’s a reference architecture based on:
      • EMC VNX5300, SQL Database: 15K SAS, BLOB Store: 7.2K NL-SAS (CIFS Share for RBS)
      • Max user capacity – 8,630 (10%)
      • BLOBs consumed 92% of content databases
      • Full crawl duration – 34 hours (4.4 documents)

Disaster Recovery

While Microsoft has some guidance for SharePoint availability, as covered in Plan for availability (SharePoint Server 2010), there’s a lot more involved in obtaining true and complete SharePoint availability across multiple sites. Storage based replication can accelerate the failover process and can scale as your farm storage needs grow. The most basic methods of availability can be achieved with SQL Server log shipping and/or database mirroring, but while effective for smaller configurations they still lack complete farm protection. The only components that can be continuously protected with database mirroring/log shipping are SQL databases, and not all of them! What about the index? WFE? App servers?

SharePoint DR involves a lot, but can be significantly simplified when virtualizing all server roles, thus providing end-to-end mobility of SharePoint farm services without worrying about the BLOB filesystem, individual databases, index partitions etc. When choosing storage based replication, the first thing to consider is leveraging consistency grouping for all SharePoint volumes. This is of great value as it can guarantee end-to-end (I like that term 🙂 ) consistency at any point in time.

Here’s a table to help you understand what is related to what and how to go about consistency grouping, which is available with almost all EMC replication solutions (SRDF, RecoverPoint, MirrorView etc.):

While this is a great solution, that type of DR still involves manual failover, restart and configuration. That’s the reason why I always recommend virtualizing all server roles, which enables you to leverage automation solutions for virtual infrastructure, namely vCenter Site Recovery Manager (SRM) or multi-site Hyper-V clustering enabled by EMC Cluster Enabler (SRDF, RecoverPoint, MirrorView). Assuming VPLEX is deployed, you won’t even need Cluster Enabler; you can just rely on Windows Server Failover Cluster (VMs) to achieve that.

There are several reference architectures we have successfully tested and published available on emc.com:

I’ll keep updating this post based on feedback, updated findings and updated guidance from our friends at Microsoft.

Enjoy,

E!

Hello again my virtual friends.

Now that EMC World is behind us, I found the time for some tech updates. This time it’s about another solution we have tested leveraging EMC’s VPLEX Geo.

VPLEX is a solution for federating EMC and other vendors’ storage. It sits between the servers and your storage, adding a virtualization layer. But it can do much more than that, as it presents a sophisticated SDRAM cache which can be distributed to a remote site while maintaining coherence. The EMC VPLEX family consists of three viable configurations/offerings:

  • VPLEX Local – For managing data mobility and access within the walls of your data center using a single VPLEX cluster
  • VPLEX Metro – For mobility across two sites separated by an inter-site latency of up to 5 ms (round trip). We tested that solution last year: vMotion over distance for Microsoft, Oracle and SAP.
  • VPLEX Geo – For access between two sites over extended asynchronous distances with up to 50 ms latency (RTT).

A couple of months ago the virtualization team in Hopkinton worked on testing application mobility on VPLEX Geo, this time with Microsoft Hyper-V clustering (sounds strange?! Yes, we work with both VMware and Hyper-V in our labs, but don’t expect us to add a NetApp FAS or anything crazy like that 🙂).

It’s an interesting whitepaper that covers SAP, Oracle and SharePoint mobility with a heterogeneous VMAX and VNX configuration. The WAN link was optimized using EMC Select partner Silver Peak, but let me highlight the SharePoint part of that solution. For further reading please download Long distance application mobility – Enabled by VPLEX Geo.

Physical Architecture Diagram

The SharePoint farm had a few site collections and a total of 400 GB of user content. A total of seven VMs constituted the server farm: 3 WFE, 2 Index/Crawl, 1 App and 1 SQL. The configuration supported more than 12,000 users with 10% concurrency and a sub-3-second user response time for all operations (browse, search, modify). Hyper-V clustering was configured using CSVs (Cluster Shared Volumes).

The highlight of the test was migrating the ENTIRE farm from site A to site B, with a simulated distance of 2,000 km (~1,200 miles), WITHOUT a disruption of service using Live Migration. While the migration took place, the farm’s user load capacity degraded somewhat but was still able to sustain more than 9,000 users (-23% load).
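As a quick sanity check on those numbers, here is a small Python sketch of the degradation arithmetic. It assumes a baseline of exactly 12,000 users (the test reported "more than 12,000"), and the function name is mine.

```python
def capacity_during_migration(baseline_users: int, drop_pct: int) -> int:
    """User capacity left after a percentage drop in sustainable load."""
    return baseline_users * (100 - drop_pct) // 100


# Assumed baseline of 12,000 users and the 23% drop quoted above.
print(capacity_during_migration(12_000, 23))
```

That lands a little above 9,000 users, which lines up with the figure observed during the migration.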

Live migration times- VMs breakdown

Using Silver Peak WAN optimization in that solution resulted in an almost 70% reduction in data transferred between the two sites. Impressive!

To summarize, you have a virtual infrastructure stretched across 2 sites with ZERO downtime for migration operation and very low downtime for outage/disaster scenarios while the server/hypervisor infrastructure addresses the storage element as a SINGLE entity.

Stay tuned… best practices for SharePoint are coming in my next post.

E!

Hello everybody,

I have to apologize again for going quiet; I probably don’t have the dual talent some of my peers at EMC have: blogging and working at the same time…

I have a good idea for a startup 🙂 Have a keylogger-type program constantly run on your laptop/PC/tablet, capture important achievements, interesting topics, emails etc. and automatically publish those as blog posts and Twitter feeds! Who wants to pick it up? I guess WikiLeaks already did that to some extent…

Today, I wanted to share some recent SharePoint 2010 infrastructure testing done by our Shanghai team and led by Frances Hu. We call those “Solutions” as they are real SOLUTIONS and not just a couple of products proven to work on our storage arrays. Before I get to the details, here’s a short checklist of what’s included in that solution:

  • Server virtualization – Yes, VMware 4.1
  • High availability – Sure, VMware HA cluster based
  • Storage tiering – Yes, EMC CX4 (we didn’t get the VNXs on time) with Flash (FAST Cache), FC (System, RecoverPoint) and SATA (everything else)
  • Disaster Recovery – Oh yes, vCenter SRM using RecoverPoint/SE
  • Remote BLOB storage – Of course, that’s actually part of tiering, we used the native SQL RBS FILESTREAM for that.
  • Efficiency – Yes, the BLOB store was compressed using LUN compression
  • Backup/Replication – Check, we used Replication Manager and Metalogix Selective Restore for item level recovery.

All that in one solution which we are going to demo at EMC World 2011 (I hope you’re coming to Vegas)…

Some of the key results/figures:

  • The SharePoint farm in the solution (including the SQL server) was virtualized depicting a highly available midsize SharePoint environment
  • The sustained simulated maximum user capacity was 13,080 at 10% concurrency
  • Search crawl performance was improved by 91% and search response time was reduced by 27% when using EMC FAST Cache on those LUNs
  • 30% of disk space savings by using EMC LUN compression features on the SharePoint server BLOB store LUNs.
  • 91.2% of SQL database data file storage space was freed after enabling SQL RBS FileStream
  • A full-site disaster caused only 15 minutes of downtime while all of the farm’s VMs failed over to DR using SRM and RecoverPoint
  • Using Replication Manager 5.3.1, it took only 6 minutes to restore a 100 GB content database from a SnapView replica
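To translate the restore figure above into an effective data rate, here is a minimal sketch. It assumes 1 GB = 1,024 MB and treats the whole 6 minutes as data movement, which ignores any mount or recovery overhead; the function name is illustrative.

```python
def restore_throughput_mb_s(size_gb: float, minutes: float) -> float:
    """Effective restore rate in MB/s, assuming 1 GB = 1,024 MB."""
    return size_gb * 1024 / (minutes * 60)


rate = restore_throughput_mb_s(100, 6)  # 100 GB content database restored in 6 minutes
print(f"{rate:.0f} MB/s")
```

The same helper can be used to estimate restore windows for larger content databases, with the usual caveat that snapshot-based restores do not necessarily scale linearly with size.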

The storage layout reasoning is based mainly on cost and is somewhat controversial (mostly SATA!!!), but I like it as it is proven to work!

Here’s the environment architecture

So yes, this is common in our labs, only 3-4 physical servers to support tens of thousands of users.

For more details you can download the whitepaper from: http://www.emc.com/collateral/software/white-papers/h8139-protection-virtualized-sharepoint-wp.pdf

If you have any questions let me know…

Until next time (which I believe would be dedicated to SharePoint DR discussion).

Happy Passover/April vacation.

Eyal

Hi again,

In an older post, I shared key data points from our SharePoint EBS externalization solution based on Metalogix StoragePoint v2 with MOSS 2007 and EMC Atmos cloud storage; the whitepaper is available on the EMC website.

This time around I wanted to share a more recent use case delivered by our Unified Solutions Engineering group based in RTP, NC. They put an EMC Unified NS-480 storage system to the test with various BLOB storage configurations using StoragePoint v3 with Microsoft SharePoint 2010. The solution used three storage configurations: 1. No externalization 2. RBS on FC disks 3. RBS on SATA. It also used point-in-time snapshots managed by Replication Manager to protect the entire environment (BLOB store, SQL metadata, configuration, search etc.).

The graph below displays the comparison of the three configurations in terms of overall throughput measured in Requests Per Second (RPS).

From the results we conclude that when the baseline configuration was compared with BLOB externalization on FC disks, there was an overall increase in the SharePoint farm throughput of roughly 50% for the three different user profile mix tests. This shows that moving BLOBs from SQL content databases to a dedicated file system created on FC disks increases the I/O performance of the files and the overall throughput of the SharePoint farm.
When the baseline configuration was compared with BLOB externalization on SATA disks (which are slower than FC disks but of higher capacity), there was a ~10% increase in throughput for the 80/10/10 and 70/05/25 profiles and an 8.5 percent decrease in throughput for the third profile.

Now I have to admit, mileage may vary depending on the object count, size and user activity, but the general idea is that the larger the BLOB, the more it makes sense to externalize it, with a very good chance of a performance improvement. No guarantees there, though!

May I also mention the CPU overhead involved with externalization. And if you haven’t figured it out yet, SharePoint in general really likes CPUs……

The CPU utilization of the Web and Application servers remained relatively constant with or without BLOB externalization. However, SQL server CPU utilization was higher with BLOB externalization enabled.

For more details you can actually download the document directly from EMC website.

Questions and comments are welcome.

Happy holidays,

E!

Hi all,

I know I’ve been quiet for a while, but hey, there’s just a lot to do at EMC aside from blogging 🙂
Anyway, just returned from a very short visit to the gambling capital, Vegas. I was presenting EMC solutions for Microsoft SharePoint infrastructure. The room was packed, and this is a good sign!
There were two other conferences taking place at the same time in Vegas. One was the Pool and SPA show, which seems to be very interesting but less glamorous than the annual SEMA conference. But hey, neither compares to SharePoint Connections 🙂

In the 75-minute slot I had, I talked about how SharePoint 2010 is now even more demanding and more scalable than before, the aspects and considerations for virtualizing SharePoint server roles, how storage Virtual Provisioning (VP) makes sense, and tiering in SharePoint using EBS and/or RBS with EMC storage. I also discussed some of the best practices for SharePoint capacity and performance planning, and finally touched on some of our capabilities for SharePoint farm availability (yes, lots of virtualization involved to achieve that!) and protection (Backup and DR).

It was quite a lot to cover in one presentation but I tried to address the most current capabilities EMC brings to the table, especially with all of our integration testing at Proven Solutions group.

I just thought you might find this material useful, so I have attached the PowerPoint deck. If you have any questions, feel free to contact me.

Thanks,

Eyal

SharePoint_Connections_Final(pdf_friendly)

Hey,

Just wanted to give you all a quick update on some recent work the solution teams did with regards to SharePoint storage tiering.

SharePoint, as you know (or not?), stores all the content it manages in SQL Server tables. While there is some merit in doing so, it’s mostly disadvantageous when considering larger deployments of SharePoint as a true framework for content collaboration. Those unstructured binary objects, once stored in SQL, are called BLOBs (Binary Large Objects). As the content databases keep growing, the main contributor is the BLOB data, which grows significantly faster than any associated metadata; BLOBs usually comprise ~95% of the content database size.
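To see what that ratio means for sizing, here is a minimal Python sketch that splits a content database into BLOB data and everything else. The 1,000 GB database is a made-up example and the function name is mine; the 95% default is the rough share mentioned above, and real databases will vary.

```python
def split_content_db(total_gb: int, blob_pct: int = 95) -> tuple[int, int]:
    """Split a content database size into (BLOB GB, remaining metadata GB)."""
    blob_gb = total_gb * blob_pct // 100
    return blob_gb, total_gb - blob_gb


# Hypothetical 1,000 GB content database with the ~95% BLOB share quoted above.
blob_gb, metadata_gb = split_content_db(1_000)
print(blob_gb, metadata_gb)
```

This is an upper bound on what externalization can move out of SQL; in practice not every object gets externalized, so the database usually shrinks a bit less than the raw ratio suggests.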

In order to support the “ECM for the masses” message, Microsoft introduced a couple of APIs to accomplish that externalization task. The first is EBS (External BLOB Storage), which has been available since MOSS 2007 SP1, and more recently RBS (Remote BLOB Storage), which is available in several flavors for SharePoint 2010.

In this post I’m going to highlight the recent integration work we have accomplished with a MetaLogix product called StoragePoint.

The solutions team in Santa Clara worked on a neat Cloud storage solution for SharePoint BLOBs based on EMC Atmos.

The 3 TB of SharePoint farm content was externalized to EMC Atmos, which dramatically decreased the size of the content databases and demonstrated MetaLogix StoragePoint’s ability to effectively manage the externalization of SharePoint BLOBs to EMC Atmos through its Atmos connector. Oh, and BTW, that farm was 100% Hyper-V.

Atmos on-premise testing shows that the performance is nearly identical to the traditional setup of SharePoint with SQL as indicated in the following table. Results indicate that relocating BLOBs to an external BLOB store (EBS) shows no impact to the overall user experience across the three user profiles simulated:

 

The Unified Midrange Storage Group (UMSG) in RTP, NC conducted similar tests on an EMC Celerra NS-120 in a VMware vSphere virtualized farm.

Various BLOB store flavors were tested in that case, involving an EBS BLOB store provisioned on FC drives with and without deduplication as well as on SATA drives with and without deduplication, all through a CIFS share.

The following figure shows the disk layout of the storage design:

Some highlights from that test:

  • SQL disk usage reports after BLOB externalization showed an 88% reduction in size of the content databases.
  • After deduplication was completed, 18% of the file system space was saved.
  • While externalizing BLOBs presents an overall performance improvement in retrieving objects, it may add some latency to search and modification activities, which in most deployments represent a smaller percentage of the load than browsing content (another factor is the size of the BLOBs externalized: the larger the object, the more efficient EBS/RBS is).
  • Once content is externalized, SharePoint indexing gets a boost. The full crawl finished in less than 1/3 of the time it took in the original configuration. I believe it has to do with the nature of indexing, which is sequential.

While these two solutions are based on SharePoint 2007, we plan to re-validate them soon on SharePoint 2010, but don’t expect any magic there. I suspect results would be similar.

I believe that BLOB externalization is the catalyst for SharePoint adoption in larger organizations, leveraging SharePoint ECM capabilities in the multi-TB club. EMC has a wide range of offerings in that respect and these two solutions demonstrate only part of it.

In this round the focus is on the most recent release, Microsoft SharePoint Server 2010.

The team in Cork, Ireland led by Alex Bowers has tested a fully virtualized SharePoint 2010 in the lab for the last couple of months.

Here’s the diagram of the architecture validated; the whitepaper will be available on powerlink.emc.com in the next couple of weeks.

The disk layout:

This solution illustrates the power of combining the EMC CLARiiON CX4 storage array with the virtualization capabilities of VMware vSphere. The solution provides a consolidated, well-performing server and storage platform for a large federated SharePoint environment while minimizing TCO (3 physical servers supporting more than 20,000 users with 10% concurrency!!!) at all component levels: servers, infrastructure connectivity, and storage.

The main message here folks is that SharePoint can be virtualized and we proved it numerous times, not only the web tier but search, application and SQL (yes SQL!) can be confidently virtualized with minimal overhead (<=10%).

In terms of storage power, we have plenty there, and in that paper we provide, among other things, some best practices and guidance regarding the storage resources to be allocated to a demanding environment such as this one.