Shopify Sidekick works reliably for creating flows. For all other tested areas such as theme customizations, data maintenance, and KPI analyses, it is too unreliable for productive use.
AI is making its way into more and more areas of everyday Shopify life. With Sidekick, Shopify is actively pushing this forward: the assistant integrated directly into the Admin is designed to speed up everyday tasks, from flow creation to theme customizations and data analysis.
For us at tante-e, the question was clear from the start: What can Sidekick really do, and where does it create more effort than it saves? We tested it in six typical use cases. The results are clear.
E-commerce specialist Lara supports numerous e-commerce brands with their shops and, as part of the tante-e AI Research Team, regularly tests new AI features around Shopify.
1. What is Shopify Sidekick?
Shopify Sidekick is an AI assistant integrated directly into the Shopify Admin. It is designed to understand user queries and access the data of the respective shop to provide context-related answers and actions.
With Sidekick, merchants should be able to complete tasks without having to click through menus.
It is thus aimed at both inexperienced merchants and experienced teams who want to accelerate routine tasks.

2. How we tested Shopify Sidekick: 6 Use Cases
We tested Shopify Sidekick over several weeks in six specific use cases: general Shopify questions, B2B customer creation, data maintenance, KPI analysis, flow creation, and Theme Editor adjustments.
The tests were conducted on real shop setups, including shops with custom functions, complex product structures, and individual metafield configurations. Each use case was evaluated based on three criteria:
- Time savings: Does Sidekick noticeably reduce the actual effort?
- Quality: Are the outputs correct and directly usable?
- Reliability: Does Sidekick deliver consistent results across various requests?
At the end of each use case, there is a rating on a scale of 1 to 10.
Use Case 1: General Shopify Questions
Shopify Sidekick is useful as an initial guide for standard Shopify questions, but only if sufficient prior knowledge is available to recognize misinformation.
What works:
- Navigation to native settings: Questions like "Where do I find the tax settings for my shop? Can you help me create a new tax class for my products?" or "How do I set up Shopify Markets?" are answered reliably by Sidekick. As a guide for merchants who are not yet well acquainted with the Admin, it is helpful here.
- Feature explanations: In one test, Sidekick explained UTM parameters for influencer links so clearly that we could forward the explanation directly to a client. This is one of the few situations where the output was usable without post-processing.
- CSV templates: For standard processes such as tax overrides via bulk upload, Sidekick can prepare useful CSV templates. This saves time on repetitive tasks.
- Simple code bugs: Sidekick can sometimes detect and fix minor errors in code snippets.
- Communicating limitations: Sidekick sometimes correctly refers to app developers if a question is too specific for its knowledge. This behavior is positive, but unfortunately not consistent.
What doesn't work:
- Tax questions: This is one of the most critical weaknesses in this use case. Answers about tax settings were wrong, contradictory, or lacked usable source references. We currently do not recommend Sidekick for tax-relevant configurations.
- Shopify Markets: Within the same conversation, Sidekick first claimed that the Plus plan allows a maximum of 50 Markets and shortly afterwards that there is no limit. The sources given did not match the actual content.
- App recommendations: Sidekick recommends apps based on App Store ratings, not on actual shop requirements. However, a 5-star app is not automatically the right one for the respective application. The recommendations regularly deviate from what we as an agency would recommend in practice.
- Theme information: Sidekick does not know that certain themes, such as the Impact Theme, are paid, even though this information is publicly available in the Shopify Theme Store. It also cannot currently identify themes on external shops.
- Specific client implementations: Questions about loyalty programs, backend notifications, or custom logic remain without useful answers. At best, Sidekick finds the associated metaobjects, but it cannot explain how they work or how they should be populated.
One learning from the tests is particularly relevant for everyday agency work: The more specific the question, the higher the error rate. Sidekick is strongest with general questions (at the level of the Shopify Help Center) and becomes less reliable with increasing context. This makes it a tool that requires prior knowledge, not one that replaces it.
Rating: 4/10
Use Case 2: B2B Customer Creation & Custom Logic
For simple customer creation with standard fields, Sidekick is conditionally usable. However, as soon as individual discount logic, custom functions, or metafield configurations come into play, we cannot recommend it based on our test.
What works:
- Basic customer creation with master data and standard tags works reliably.
- Initial suggestions for international collection pages that roughly match a stored knowledge base can serve as a starting point.
What doesn't work:
- Custom tags and discount functions: We asked Sidekick to link a new B2B customer with the correct discount logic. Sidekick set the tag "Apotheker" (pharmacist) instead of the correct tag "Pharmacy," which is the one that actually triggers the stored discount function. The result: the function does not fire and the discount is not applied. Sidekick has no insight into the function logic running in the background.
- Metafield forms: Sidekick does not find the link to a shop-specific embedded form and is unable to populate metafields during customer creation as instructed. For shops that actively use metafields for customer segmentation or B2B processes, this is a clear exclusion criterion.
- Adjusting menu links per market: Over three iterations, we asked Sidekick to set navigation links differently depending on the market. The answers contradicted each other every time. In the end, Sidekick admitted that this type of link change is not possible at all, which is information it should have communicated clearly from the beginning.
- Native catalog data: Sidekick finds existing catalog data but cannot assess whether and how it is currently used in the shop.
The structural problem in this use case is particularly critical: Sidekick consistently responds confidently, even if the answer is wrong. Anyone who does not know the custom logic of their own shop precisely will not recognize the errors and, in the worst case, will implement faulty configurations live.
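The tag mismatch described above is the kind of error a simple pre-check can catch before a customer goes live. The following is a minimal sketch, not Shopify's implementation: the set `TRIGGER_TAGS` and the helper `has_trigger_tag` are hypothetical, and the sketch assumes that the discount function matches tags by exact string comparison.

```python
# Hypothetical allow-list: tags that the shop's discount logic reacts to.
TRIGGER_TAGS = {"Pharmacy"}


def has_trigger_tag(customer_tags: list[str]) -> bool:
    """Return True if at least one tag matches a trigger tag exactly."""
    return any(tag.strip() in TRIGGER_TAGS for tag in customer_tags)


# Tag set by Sidekick in our test vs. the tag the function expects.
print(has_trigger_tag(["Apotheker"]))  # False: the discount would not apply
print(has_trigger_tag(["Pharmacy"]))   # True
```

A check like this only works if someone on the team knows which tags carry logic, which is exactly the prior knowledge Sidekick lacks.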
Rating: 2/10
Use Case 3: Data Maintenance & Bulk Upload
For creating individual customers, Sidekick is conditionally usable. For automated bulk uploads with more than a few entries, it is not suitable.
What works:
- Individual customer creation: Taking over master data from simple lists or CSV structures and creating individual customers works reliably. The process is intuitive, even without deep Admin knowledge.
- Validation of inputs: Sidekick recognizes invalid phone numbers, such as test numbers, and provides a corresponding hint. This is a small but useful detail.
What doesn't work:
- Stops after two customers: The most striking behavior was consistent across the test: Sidekick stops after creating exactly two customers and then has to be prompted manually each time to continue. Each customer also requires manual confirmation, as Sidekick does not save automatically.
- Deleting customers: Sidekick cannot delete customers. It only provides navigation hints, some of which are incorrect.
- Metafields: Metafields cannot be populated via Sidekick. For shops that use metafields for customer segmentation, B2B processes, or individual data maintenance, this is a fundamental exclusion criterion.
- Real bulk upload: For more than ten data records, direct CSV import in the Shopify Admin is significantly more efficient, faster, and less error-prone than using Sidekick.
The fundamental problem of this use case is structural: Sidekick is not designed as a bulk tool. It works sequentially, requires manual confirmations, and does not support extended fields. Our assessment: Anyone who tries this method anyway will invest more time than they save.
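For comparison, preparing a file for the Admin's CSV import takes only a few lines. This is a minimal sketch under assumptions: the column names are assumed to match the customer CSV template and should be verified against the template downloaded from your own Admin (a real import may expect the full set of template columns). The sample records and the helper `build_customer_csv` are invented for illustration.

```python
import csv
import io

# Assumed subset of columns from the customer CSV template; verify against
# the template downloaded from your own Admin before importing.
COLUMNS = ["First Name", "Last Name", "Email", "Tags"]


def build_customer_csv(customers: list[dict[str, str]]) -> str:
    """Serialize customer records into one CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(customers)
    return buffer.getvalue()


# Invented sample data for illustration only.
sample = [
    {"First Name": "Ada", "Last Name": "Example",
     "Email": "ada@example.com", "Tags": "Pharmacy"},
    {"First Name": "Ben", "Last Name": "Sample",
     "Email": "ben@example.com", "Tags": "Pharmacy, B2B"},
]
print(build_customer_csv(sample))
```

Because the whole list is written in one pass, the effort does not grow with manual confirmations per record, which is the main difference from the sequential Sidekick workflow.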
Rating: 2/10
Use Case 4: KPI Analysis & Data Evaluation
Sidekick can reliably read out simple key figures. As soon as shop-specific context, complex product structures, or period comparisons come into play, the results in our tests became unusable.
What works:
- Simple key figures: Conversion Rate, Average Order Value, and similar basic key figures are read correctly by Sidekick from Shopify Analytics.
- Generate analysis links: In one test, Sidekick created a complete analysis overview with a directly callable link to Shopify Analytics. For merchants without in-depth analytics knowledge, this is a useful starting point.
What doesn't work:
- Complex product structures: In a shop with complex products, including fake variants, Sidekick calculated sales figures completely incorrectly. The result showed an increase of 887%, while actual sales were declining. Sidekick had fundamentally misunderstood the product logic.
- Session data: Sessions were calculated incorrectly because Sidekick included product categories that should have been explicitly excluded.
- Checkout abandonments: The results did not match the manual analysis.
- Timeframes: In another shop, Sidekick only compared two individual days instead of the defined analysis period. This error was not directly recognizable in the output and would have gone unnoticed without cross-checking.
- Recommendations for action: Reading pure data points works best. As soon as Sidekick has to interpret context and derive recommendations, the quality significantly declines.
The central learning from this test: The more context Sidekick has to process, the less reliable the results become. For shops with individual structures, bundles, or custom logic, AI-supported analysis via Sidekick is currently not justifiable. Human expertise remains absolutely necessary here.
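How much the timeframe error can distort a result is easy to show with a manual cross-check. The following sketch uses invented daily sales figures and a hypothetical helper `percent_change`; it does not reproduce Sidekick's calculation and only contrasts a full-period comparison with a comparison of two individual days.

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100


# Invented daily sales for two three-day periods.
previous_period = [500.0, 400.0, 100.0]
current_period = [200.0, 100.0, 400.0]

# Intended analysis: compare the totals of both full periods.
full = percent_change(sum(previous_period), sum(current_period))

# Error pattern: compare only two individual days
# (here the last day of each period, chosen arbitrarily).
two_days = percent_change(previous_period[-1], current_period[-1])

print(f"full periods: {full:+.1f}%")      # -30.0%
print(f"two days:     {two_days:+.1f}%")  # +300.0%
```

With these figures, the same data yields a decline over the full period and a strong increase for the two-day comparison, which is why an unchecked timeframe can reverse the conclusion.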
Rating: 3/10
Use Case 5: Creating Shopify Flows
Creating Shopify Flows is clearly the strongest use case of our entire test. For simple to medium-difficulty automations, Sidekick delivers consistent, directly usable results, even without prior knowledge of flow syntax or trigger logic.
What works:
- Simple and medium-difficulty flows: Anti-fraud flows, inventory alerts, email triggers, and similar automations are generated reliably. Sidekick understands the task and builds the flow structurally correctly.
- Independent detection: Sidekick independently recognizes when a flow is the appropriate solution for a request, without the word "Flow" having to be explicitly used in the prompt.
- Preview before adoption: Before a flow is adopted, Sidekick shows a preview. This allows checking the output before it becomes active.
- Transparency at limits: If a flow step requires Liquid code, Sidekick clearly indicates this and shows up to what point the implementation is possible without a developer.
- Template-based prompting: Copying an existing flow description and pasting it directly as a prompt works particularly well. Those who already use flows can use this method to quickly replicate similar automations.
- No independent activations: Sidekick does not activate or delete flows independently. Final control remains with humans.
What doesn't work:
- Inconsistent tags: In some tests, tags were not handled consistently. For example, a condition used "equal" instead of "include," or a tag was added twice. Outputs should be checked before activation.
- Very complex flows: Flows with multiple nested condition levels are difficult to describe in natural language. The effort required to formulate the prompt can negate the time savings.
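Why "equal" instead of "include" matters for tags can be illustrated in plain Python. This is not Flow syntax: the function names are hypothetical, and the sketch assumes that an equality check compares against the whole tag list while an inclusion check only looks for membership.

```python
def tags_equal(order_tags: list[str], expected: str) -> bool:
    """'Equal'-style check: True only if the expected tag is the only tag."""
    return order_tags == [expected]


def tags_include(order_tags: list[str], expected: str) -> bool:
    """'Include'-style check: True if the expected tag is among the tags."""
    return expected in order_tags


# Invented example: an order that carries more than one tag.
order_tags = ["wholesale", "high-risk"]
print(tags_equal(order_tags, "high-risk"))    # False
print(tags_include(order_tags, "high-risk"))  # True
```

Under this assumption, a flow built with the stricter check would silently skip every order that carries an additional tag.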
Especially for merchants without flow knowledge, the added value is real. The prompt quality strongly determines the output: The more concrete the request, the better the result.
Rating: 7/10
Use Case 6: Theme Editor
For global standard settings in native Shopify themes, Sidekick is conditionally usable. Anything beyond predefined theme settings is prone to errors and time-consuming.
What works:
- Global standard settings: Changes to colors, badge displays, button styles, or product image formats work reliably, as long as these are settings that the theme natively offers. For changes that affect several schemas simultaneously, Sidekick actually saves clicks.
- Recognize available sections: Sidekick can identify the standard blocks of a theme and suggest suitable sections.
What doesn't work:
- Product-specific adjustments: Hiding the price for a single product is not possible via Sidekick. It only has access to global theme options, not product-specific ones.
- Custom CSS: Sidekick generates CSS code that had no effect in the tested cases. It has no actual insight into the theme code and guesses at CSS requests based on assumptions.
- Section management: In one test, Sidekick built a section on the wrong page, on the product page instead of the homepage, even though the homepage was explicitly mentioned in the prompt. It took four to six iterations to complete a simple task correctly.
- Navigation: Sidekick does not know which page is currently open in the Theme Editor. Navigation hints like "Click on the gear icon, then colors" are imprecise and sometimes incorrect. Unlike the Shopify Admin, Sidekick has little orientation in the Theme Editor.
- Link generation: In one test, Sidekick generated a link that led nowhere. Sidekick itself did not notice the error.
- Custom Themes: For individually customized themes, such as a client-specific modified Dawn theme, Sidekick has no insight into the code. Even for Shopify's own themes like Horizon, support beyond basic settings is not reliable.
The conclusion from this use case can be summarized briefly: Anything Sidekick can do in the Theme Editor can be done manually in roughly the same amount of time. Anything beyond that costs significantly more time through iterations and errors than it saves.
Rating: 2/10
3. Conclusion: What our big Shopify Sidekick test shows
Shopify Sidekick has a clearly defined area of strength: creating Shopify Flows. Here it delivers consistent, usable results and brings real added value even to inexperienced merchants. In all other tested areas, the weaknesses outweigh the benefits.
| Use Case | Rating | Brief assessment |
|---|---|---|
| General Shopify Questions | 4/10 | Useful as a guide, only usable with prior knowledge |
| B2B Customer Creation & Custom Logic | 2/10 | Basic setup possible, unreliable with custom logic |
| Data Maintenance & Bulk Upload | 2/10 | Breaks off, cannot delete, no metafield support |
| KPI Analysis & Data Evaluation | 3/10 | Simple key figures work, complex structures do not |
| Creating Shopify Flows | 7/10 | Clearly strongest use case, real added value without prior knowledge |
| Theme Editor | 2/10 | Only global standard settings, everything else prone to errors |
The biggest structural problem runs through all use cases: Sidekick hardly knows the actual shop context. Theme code, open pages, custom functions, individual tag logic, or specific metafield structures are beyond its access. When knowledge is lacking, the Shopify AI assistant prefers to hallucinate rather than admit it doesn't know. This unpredictability is not a minor problem, but the core risk when using Sidekick: You can never be sure whether an answer provides reliable guidance or confident half-knowledge.
4. Who is Shopify Sidekick suitable for?
We recommend Shopify Sidekick with clear limitations. Its use is worthwhile in certain scenarios, while in others it costs more time than it saves.
Suitable for:
- Merchants who want to create or adjust Shopify Flows, especially without technical prior knowledge
- Automations with a clearly describable process: anti-fraud checks, inventory alerts, email triggers
- Merchants looking for initial inspiration or a template for flows
- Simple, global theme settings like colors or image formats, where you don't want to search yourself
- General orientation questions about native Shopify features, if prior knowledge for cross-checking is available
Not suitable for:
- Shops with custom themes: Sidekick has no insight into individual theme code
- Bulk data maintenance for customers or products: too slow, too inconsistent, direct CSV import in the Admin is more efficient
- KPI analyses for shops with complex product structures, bundles, or custom logic: too prone to errors
- Individual theme adjustments beyond standard settings: direct developer involvement is more efficient here
- Anything that concerns custom functions, specific tags, or metafield structures: Sidekick does not know these logics
Sidekick is not a tool that replaces Shopify expertise. As of today, it is an assistant that can accelerate simple, well-defined tasks, but only if someone with sufficient prior knowledge can classify and cross-check the outputs.