Divide and Conquer: A Guide to Object-to-Object Migration from SAP to Snowflake
Data warehouse migrations are colossal undertakings that should not be taken lightly. But if you're reading this article, it means your company has weighed its options and decided to abandon the old on-premises architecture in favor of a future-proof, scalable cloud solution. I commend you.
However, years of reliance on the SAP architecture and the chaos of proprietary object types can discourage even the most ambitious executives and make the task of migrating to the cloud seem more daunting than it really is.
This article covers translating key SAP BW and HANA components into equivalent Snowflake objects. We'll also cover specific tools and strategies that can help accelerate the migration while ensuring end-user approval and adoption.
Dividing a big problem into manageable parts is an age-old strategy in combat (and in design). The expression "divide and conquer" can be traced back at least to Philip II of Macedon, and probably earlier still.
Dividing a migration into manageable chunks allows for better estimation, task dependencies, and parallel workflows. But what is a "manageable" part? Before we can determine the work units, we need to understand the building blocks.
If I am able to determine the enemy's dispositions while at the same time I conceal my own, then I can concentrate and he must divide.
-Sun Tzu
Divide
SAP projects are notoriously complex. Similar objects can be set up in drastically different configurations, and different object types overlap in functionality. Plus, there are more places to hide code than there are for a dime in a pair of cargo pants.
Fortunately, every SAP object has a corresponding Snowflake counterpart. The degree of similarity varies by feature, but the migration can proceed without getting blocked by feature mismatches.
Here is an overview of how SAP objects map to their Snowflake equivalents. Let's take a closer look at each of them below.
InfoObjects
If an InfoObject does not contain any master data, it is simply a column and does not need to be maintained independently in Snowflake. InfoObjects with master data are tables. Master data attributes are also columns; they don't need to be maintained or declared separately.
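As an illustration (the object and column names below are hypothetical), a characteristic with master data might land in Snowflake as a plain table:

```sql
-- A hypothetical InfoObject (characteristic) with master data becomes an ordinary table;
-- its master data attributes are simply columns on that table.
CREATE OR REPLACE TABLE dim_material (
    material_id   VARCHAR NOT NULL,  -- the characteristic key
    material_name VARCHAR,           -- text / description
    material_type VARCHAR,           -- master data attribute
    base_unit     VARCHAR            -- master data attribute
);
```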
DSO / ODS
Ralph Kimball gave us the concept of fact and dimension tables. SAP went a step further and turned them into system objects. The fact that an InfoObject typically holds dimension data while DSOs are used as fact tables does not change the reality that both are database tables.
To keep facts and dims separate, use naming conventions in your table names.
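One possible convention (the prefixes and names here are only an example) is to mark facts and dimensions explicitly:

```sql
-- A hypothetical fact table converted from a DSO, using a fct_/dim_ prefix convention
-- so facts and dimensions are easy to tell apart at a glance.
CREATE OR REPLACE TABLE fct_sales (
    order_id    NUMBER,
    material_id VARCHAR,        -- references dim_material
    order_date  DATE,
    amount      NUMBER(18, 2)
);
```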
HANA Views and Composite Providers
Calculation views, attribute views, analytic views: all become views. Composite providers? You guessed it, views (with UNION). Writing views by hand instead of fiddling with menu options across four different object types is infinitely faster and gives you more control over the output.
The benefit of this cannot be overstated.
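As a rough sketch (all table and view names are hypothetical), a calculation view and a composite provider might translate into something like this:

```sql
-- A plain view standing in for a HANA calculation view.
CREATE OR REPLACE VIEW v_sales_enriched AS
SELECT f.order_id,
       f.order_date,
       f.amount,
       d.material_name
FROM fct_sales    f
JOIN dim_material d ON d.material_id = f.material_id;

-- A UNION ALL view playing the role of a composite provider.
CREATE OR REPLACE VIEW v_sales_all_regions AS
SELECT * FROM fct_sales_emea
UNION ALL
SELECT * FROM fct_sales_apac;
```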
ABAP routines and programs
They live among us and hide where you least expect them, and finding them all will be hell. Converting them to Snowflake objects doesn't have to be.
Routines are typically used to make small adjustments to column values, such as a CASE or CONCAT. You can handle these operations directly in the INSERT or SELECT statement by writing your own SQL. If you want reusable code snippets, use Snowflake UDFs (user-defined functions). UDFs can be written in SQL (thankfully!), JavaScript, and Java.
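A minimal sketch of a SQL UDF replacing a field-level routine (the function, table, and column names are made up):

```sql
-- Hypothetical SQL UDF: the kind of small value adjustment a transformation routine would do.
CREATE OR REPLACE FUNCTION full_name(first_name VARCHAR, last_name VARCHAR)
RETURNS VARCHAR
AS
$$
    TRIM(first_name || ' ' || last_name)
$$;

-- Used inline, exactly where the routine logic used to live.
SELECT full_name(first_name, last_name) AS customer_name
FROM customers;
```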
In place of ABAP programs, which allow for procedural logic and database manipulation, Snowflake uses stored procedures. Procedures can be defined in JavaScript, SQL (Snowflake Scripting), Scala (Snowpark), and Java (Snowpark).
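For comparison, here is a minimal Snowflake Scripting sketch of the kind of load logic an ABAP program might perform (all object names are illustrative):

```sql
-- Hypothetical procedure: move rows from a staging table into a fact table and report the count.
CREATE OR REPLACE PROCEDURE load_sales_facts()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
    INSERT INTO fct_sales (order_id, material_id, order_date, amount)
        SELECT order_id, material_id, order_date, amount
        FROM stg_sales;
    RETURN 'Loaded ' || SQLROWCOUNT || ' rows';
END;
$$;

CALL load_sales_facts();
```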
Unlike routines and programs, which can be scattered throughout the data pipeline—inside objects and behind menus—Snowflake functions and procedures are securable database objects like tables and views. They are arranged accordingly in a scheme.
Here is a brief comparison between UDFs and stored procedures. Detailed information can be found in the Snowflake documentation.
Record mode and CDC
Record mode values on changed records, combined with multiple tables mapped to individual objects, are SAP's answer to Change Data Capture (CDC). Snowflake's answer is streams.
Streams are built on top of database objects and track changes to the underlying data. They provide a complete reference of which records were inserted, updated or deleted.
Streams, like tables, can be queried and used to load or update tables. Multiple streams can exist for a given object to feed different targets, and they can even be created on views!
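A minimal sketch of a stream feeding a target table (table and stream names are hypothetical):

```sql
-- Track changes on the staging table.
CREATE OR REPLACE STREAM stg_sales_stream ON TABLE stg_sales;

-- Consuming the stream in a DML statement advances its offset;
-- metadata columns describe each change.
INSERT INTO fct_sales (order_id, material_id, order_date, amount)
    SELECT order_id, material_id, order_date, amount
    FROM stg_sales_stream
    WHERE METADATA$ACTION = 'INSERT';
```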
Learn more in the Snowflake documentation.
Process chains
Process chains are the DAGs of an SAP data warehouse, controlling the flow of data with schedules and dependencies. The Snowflake equivalent is tasks. A task can run on a schedule or as a child of a parent task, executing a SQL command or calling a stored procedure.
Tasks can be chained together, creating dependencies and parallel execution, much like a process chain. When combined with streams and stored procedures, tasks are a powerful tool for running continuous ELT workflows without investing in a third-party orchestration tool.
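A sketch of a two-step chain: a scheduled root task followed by a dependent child task (warehouse, task, and procedure names are all hypothetical):

```sql
-- Root task runs on a schedule and loads the staging tables.
CREATE OR REPLACE TASK t_load_staging
    WAREHOUSE = etl_wh
    SCHEDULE  = 'USING CRON 0 2 * * * UTC'
AS
    CALL load_staging_tables();

-- Child task runs only after the root task succeeds.
CREATE OR REPLACE TASK t_load_facts
    WAREHOUSE = etl_wh
    AFTER t_load_staging
AS
    CALL load_sales_facts();

-- Tasks are created suspended; resume the children first, then the root.
ALTER TASK t_load_facts   RESUME;
ALTER TASK t_load_staging RESUME;
```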
InfoCubes
Converting an InfoCube to a Snowflake object is the least straightforward of all the examples given. An InfoCube is a strictly defined multidimensional object with many configurable parameters and no direct equivalent in Snowflake.
Don't let this put you off.
Under the hood, an InfoCube can be summarized as an extended star schema, which is, ironically, something of a snowflake schema. A cube has a central fact table joined to (between 4 and 16) dimensions, plus "navigation attributes," which are simply attributes of those dimensions' master data. In addition, aggregates may be present.
Since there is no such thing in Snowflake, you have to work with existing objects to achieve star architecture. To get the performance and data freshness you want, you need to experiment with a combination of tables, views, and materializations of denormalized data.
For aggregates, Snowflake (Enterprise Edition) offers a similar feature: materialized views. A materialized view is a precomputed set of data derived from a query specification, like what you would get with a CTAS. But unlike a CTAS, Snowflake monitors the base table for changes and automatically (and serverlessly) refreshes the materialized result.
Snowflake's query optimizer can even redirect queries against the base table to a matching materialized view.
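As an illustration (table and column names are hypothetical), an InfoCube aggregate might become a materialized view like this:

```sql
-- A precomputed monthly aggregate over the fact table, kept current by Snowflake automatically.
CREATE OR REPLACE MATERIALIZED VIEW mv_sales_by_month AS
SELECT DATE_TRUNC('MONTH', order_date) AS order_month,
       SUM(amount)                     AS total_amount
FROM fct_sales
GROUP BY DATE_TRUNC('MONTH', order_date);
```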
PSA tables
The Persistent Staging Area (PSA) is the inbound storage area for data from source systems in SAP BW. In Snowflake, this is called a stage, and it is built on top of existing cloud storage providers (currently AWS S3, Azure Blob storage containers, and GCP buckets are supported).
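A minimal sketch of an external stage and a load from it (the bucket, integration, and table names are hypothetical):

```sql
-- External stage over existing cloud storage, playing the role of the PSA landing area.
CREATE OR REPLACE STAGE sap_extracts
    URL = 's3://my-bucket/sap/extracts/'
    STORAGE_INTEGRATION = my_s3_integration
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Load the staged files into a staging table.
COPY INTO stg_sales
FROM @sap_extracts/sales/;
```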
Conquer
Now that we've seen what we're up against, let's create a plan of attack. After analyzing the components piece by piece, we can plan a strategy that takes into account the must-have features of the current solution, while also focusing on the strengths of the target platform.
Previous migration articles have covered concepts such as agile methodology, MVPs, and the importance of accessible design documents (links below). Now let's explore how to find a "champion" for our data crusade.
Crown a champion
The data warehouse team will drive the migration. But sending them in without backup is risky. This is where a "champion" comes into play.
A champion is typically the head of a high-value department with an interest in data-driven decision making, in other words, a power user. Involve them early and they become a crucial ally in building and promoting the new solution, because your success becomes their success.
Understanding your champion's data requirements and including them throughout development, but particularly in the early stages, will provide significant guidance on the features end users want.
Avoid getting overwhelmed by the diverse needs of an entire organization: focus on a single, representative use case and tailor the platform's features and capabilities to its precise specifications.
This approach ensures that you'll have a loyal fan base and a top-notch advocate once the new platform launches. The champion can vouch for the new solution to the board and provide the rest of the organization with a blueprint for leveraging the data platform for maximum impact.
Features developed in collaboration with your champion already serve a real-world use case and are likely to scale to meet the needs of the rest of the organization.
But don't pull an Oberyn Martell and celebrate before the fight is over. You can find new champions and repeat the exercise to evolve the solution beyond the initial MVP into a mature data platform.
Training
Moving to a new and fundamentally different data platform means you need to educate and upskill your developers. Fortunately, Snowflake is notoriously easy to adopt thanks to its rich feature set, ANSI SQL compatibility, and near-zero maintenance.
This guide describes Snowflake alternatives to the most common SAP objects, but does not include Snowflake-unique objects such as pipes, tags, and masking policies. Taking these features into account during the migration planning phase improves project efficiency and saves costly rework later on. Investing in education and training will certainly pay off.
Fortunately, Snowflake offers a variety of training programs to help your business achieve its goals. This includes free self-paced courses and instructor-led training courses designed to meet the unique needs of each organization.
For those who are completely new to Snowflake, I recommend the 4-day basics course, which covers all the essentials of the platform.
Check out the full catalog of Snowflake courses here.
Leading a database migration in-house does not mean doing it empty-handed. Whether specific to SAP migrations or generally applicable to an enterprise-wide project, the following resources exist to automate ancillary tasks and support your operations:
- Tools for documenting the divide-and-conquer exercise and creating workable designs for the development phase.
- Bring source data from SAP ECC and BW systems to the cloud.
- Automation of bulk conversion of HANA objects.
Documentation, planning and modeling
As mentioned earlier, a migration from SAP to Snowflake breaks an existing model down into its building blocks, translates those components into equivalent Snowflake functions, and assembles them into neat, time-bound results. This requires good documentation.
With the right tool, documentation can also serve as a blueprint for subsequent development – it becomes an evolving collaboration platform rather than an outdated throwaway product.
When it comes to Snowflake-specific modeling tools that also enable real-time collaboration and documentation, SqlDBM is the only modern cloud-based game in town. SqlDBM allows planning and conceptual designs to evolve with the project and generate fully qualified and deployable Snowflake DDL when the project moves into the physical modeling phase.
Additionally, blueprints and documentation remain up-to-date, synced, and searchable even after the platform launches. There are even advanced data governance features like object tagging and glossaries.
Review all the features that SqlDBM provides for your database before, during, and after the migration on its website.
Data integration
Organizations running SAP BW or HANA warehouses likely also have SAP ECC data sources, which are notoriously difficult to extract from. HVR (now part of Fivetran) is one of the few data integration tools on the market that specializes in extracting data from SAP ECC and HANA; it even supports log-based CDC deltas.
HVR ingests data from SAP cluster and pool tables and transparently loads it into Snowflake as regular rows and columns. The tool can even validate and compare the extracted datasets. For more information on features and the complete list of supported connectors, see the HVR website.
Automated HANA migration
BladeBridge Converter is a code migration tool designed to batch refactor code to/from various data platforms including SAP HANA. Organizations with large HANA footprints can benefit from the automated template-based conversion that BladeBridge provides.
Although SAP BW is not yet officially supported as a source, some BW components can also be configured for Snowflake refactoring via BladeBridge templates. Learn more on the BladeBridge website.
With this briefing concluded, the operational instructions for performing a full-scale lift-and-shift migration from SAP to Snowflake should be clear. By taking what seems like a monumental effort and breaking it down into manageable chunks, your developers can accomplish this mammoth task by overwhelming each piece with a concentrated attack.
The links embedded in this article provide extensive additional support for training and equipping your developers. The following items provide in-depth reading on tactics and lessons learned from previous campaigns.
Remember your training, and you will do well.
Until next time!