E-Business Study Guide · IDSI12200
Lecture 1

Introduction to E-Business

Example: Taobao Villages
  • The first farming village to take up e-commerce on a large scale was Dongfeng Village in Shaji Town, Jiangsu Province
  • Households joined the digital economy by getting involved in furniture production and selling their finished goods online
  • What is a Taobao village?
  • Residents got started in e-commerce spontaneously, primarily through Taobao Marketplace
  • Total annual e-commerce transaction volume is at least RMB10 million ($1.6 million)
  • At least 10% of village households actively engage in e-commerce or at least 100 active online shops have been opened by villagers
  • 5,425 such villages existed in 2020, generating total revenue of RMB 1,000 billion, with about 30,000 active online shops
  • Source: Alibaba’s Research Division
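The two quantitative criteria above can be written as a simple check. This is only a sketch: the function and parameter names are made up for illustration, the village figures are hypothetical, and the qualitative criterion (residents started spontaneously) is not modeled.

```python
def is_taobao_village(annual_volume_rmb, households, ecommerce_households, active_shops):
    """Check the quantitative Taobao village criteria (illustrative names only)."""
    volume_ok = annual_volume_rmb >= 10_000_000  # at least RMB 10 million per year
    share_ok = households > 0 and ecommerce_households / households >= 0.10
    shops_ok = active_shops >= 100
    # Volume is required; the household share OR the shop count must also be met
    return volume_ok and (share_ok or shops_ok)

# Hypothetical village: 800 households and 60 active online shops
print(is_taobao_village(12_000_000, 800, 90, 60))  # True: 90/800 = 11.25% of households
print(is_taobao_village(12_000_000, 800, 40, 60))  # False: 5% of households, under 100 shops
```

Note that the last two conditions are alternatives: a large village can qualify on shop count even if fewer than 10% of households sell online.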
E-Business Opportunities
  • Reach
  • More than 6.04 billion Internet users globally, about 73.2% of the world’s population (Statista 2025)
  • Internet users spend 6 hours and 36 minutes online daily on average
  • Constant connectivity via digital devices (phones, tablets, smart watches, etc.)
  • Richness
  • Detailed product information on billions of pages indexed by Google, videos, feeds, reviews, etc.
  • Personalized messages for customers
  • Affiliation
  • Partnership is the key in the networked economy
Network Economic Structures
  • Strategic alliances (strategic partnerships)
  • Coordinate strategies, resources, skill sets
  • Form long-term, stable relationships with other companies and individuals
  • Based on shared purposes
  • Strategic partners
  • Come together for specific project or activity
  • Form many intercompany teams
  • Undertake variety of ongoing activities
  • E.g., Partnership between Starbucks and Spotify – Starbucks in-store music provided by Spotify
Network Economic Structures (cont’d.)
  • Network organizations
  • Well suited to information-intensive technology industries
  • Sweater example
  • Knitters organize into networks of smaller organizations
  • Specialize in styles or designs
  • E.g., H&M doesn’t own any factories; instead, it works with over 800 independent suppliers
  • IT makes such networks easier to construct and maintain
E-Commerce and E-Business
  • Electronic commerce
  • Buying and selling of products using the Internet
  • Not only financial transactions between organizations and customers
  • All electronically mediated transactions between an organization and third parties
  • All pre-sale and post-sale activities across the supply chain
  • Electronic business
  • A broader term
  • Transformation of key business processes (e.g., research & development, marketing, manufacturing, logistics, etc.) using IT
Why Study E-Commerce?
  • E-Commerce brings fundamental changes to commerce
  • Traditional commerce:
  • Consumers as passive targets
  • Mass-marketing driven
  • Sales-force driven
  • Fixed prices
  • Information asymmetry
Unique Features of E-Commerce Technology
  • Ubiquity
  • Global reach
  • Universal standards
  • Information richness/density
  • Interactivity
  • Personalization/customization
  • Social technology
Buy-Side and Sell-Side E-Commerce
  • Buy-side e-commerce
  • Transactions to procure resources needed by an organization from its suppliers
  • Sell-side e-commerce
  • Transactions involved with selling products to an organization’s customers
Different Types of Sell-Side E-Commerce
  • Transactional e-commerce sites
  • Enable purchase of products online
  • Services-oriented relationship-building websites
  • Provide information to stimulate purchase and build relationships
  • Encourage offline sales and generate enquiries from potential customers
  • Brand-building sites
  • Provide an online experience of the brand
  • Products not available for purchase
  • Portal or media sites
  • Provide information, news or entertainment about a range of topics
  • Social networks
  • Facilitate company and customer communications
Categories of E-Commerce/E-Business
  • Business-to-consumer (B2C)
  • Consumer shopping on the Web
  • Business-to-business (B2B): e-procurement
  • Transactions conducted between Web businesses
  • Supply management (procurement) departments
  • Negotiate purchase transactions with suppliers
  • Consumer-to-consumer (C2C)
  • Individuals buying and selling among themselves
  • Web auction site
  • C2C sales included in B2C category
  • Seller acts as a business (for transaction purposes)
  • Business-to-government (B2G)
  • Business transactions with government agencies
  • Paying taxes, filing required reports
Drivers of Business Adoption of E-Business
  • Cost/efficiency drivers
  • Increasing speed with which supplies can be obtained
  • Increasing speed with which goods can be dispatched
  • Reducing sales and purchasing costs
  • Reducing operating costs
  • Competitiveness drivers
  • Customer demand
  • Improving the range and quality of services offered
  • Avoid losing market share to businesses already using e-commerce
Risks and Barriers to Business Adoption of E-Business
  • Website failure because of traffic overload
  • System security
  • Customer privacy
  • Problems with order fulfillment
  • Poor customer service
  • Costs (set-up and running)
Future of E-Business
  • Source: https://www.gartner.com/en/articles/top-technology-trends-2026
Five IT Megatrends in the Information Age: Mobile Computing
  • We are living in a post-PC era
  • In developing countries, mobile devices often leapfrog traditional PCs
  • Implications:
  • Increased collaboration
  • The ability to manage business in real time
  • New ways to reach customers
Example: Burger King’s “Hangover Whopper”
  • The mobile app uses a face-scanning technology to determine a user’s hangover level and offer a discount
  • Source: https://gizmodo.com/the-digital-creator-starter-pack-the-investments-criti-1851058717
Five IT Megatrends in the Information Age: Social Media
  • About 3 billion monthly active Facebook users share status updates or pictures with friends and family
  • Organizations use social media to encourage employee collaboration or to connect with their customers
Benefits of Social Media
  • Source: https://www.brafton.com/blog/social-media/12-benefits-of-social-media-and-all-the-ways-it-can-impact-your-business-for-good/
Five IT Megatrends in the Information Age: The Internet of Things
  • Devices have embedded computers and sensors, enabling connectivity over the Internet
  • More devices are connected to the Internet than people living on earth
  • The Internet of everything?
Example 1: Smart Home
  • Appliances are equipped with sensors and Internet-connected
  • Enabling remote monitoring and management of appliances, e.g., lighting and heating
  • Smart home systems and devices often operate together and share data among themselves
  • Examples:
  • Google Home, Philips smart lightbulbs, washing machines, etc.
Example 2: Smart Tattoo
  • A temporary tattoo that can communicate with devices and transmit data over Bluetooth, like a wearable device
  • The current prototype focuses on fitness and health tracking (e.g., tracking steps or monitoring heart rate)
Five IT Megatrends in the Information Age: Cloud Computing
  • Web technologies enable using the Internet as the platform for applications and data
  • Many regard cloud computing as the beginning of the “fourth wave”
  • Applications that used to be installed on individual computers are increasingly kept in the cloud
  • e.g., Gmail, Google Docs, Google Calendar
Five IT Megatrends in the Information Age: Big Data
  • On average, every person creates at least 1.7 MB of data per second
  • By 2025, an estimated 463 exabytes of data will be generated each day
  • 1 exabyte = 1 billion GB
  • IDC estimates that as much as 33% of all data will contain information that might be valuable if analyzed
  • Companies in the Information Age economy are creating value not from people, but from data.
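To get a feel for the scale of these figures, the unit conversions can be worked through as below. This is a rough sketch that assumes decimal (SI) units, i.e., 1 GB = 1,000 MB. The two statistics come from separate estimates, so they are converted separately rather than combined.

```python
MB_PER_GB = 1_000              # decimal (SI) units assumed
GB_PER_EB = 1_000_000_000      # 1 exabyte = 1 billion GB, as stated above
SECONDS_PER_DAY = 24 * 60 * 60

# 1.7 MB per second, per person, expressed as GB per day
per_person_gb_per_day = 1.7 * SECONDS_PER_DAY / MB_PER_GB

# 463 exabytes per day, expressed in GB
global_gb_per_day = 463 * GB_PER_EB

print(f"{per_person_gb_per_day:.2f} GB per person per day")     # 146.88
print(f"{global_gb_per_day:,} GB generated worldwide per day")  # 463,000,000,000
```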
Example: Netflix Analyzing Our TV-Watching Habits
  • Netflix looked for the episode that, after viewing, kept 70 percent of people on board for the rest of the season
  • Examples:
  • Breaking Bad — Episode 2
  • How I Met Your Mother — Episode 8
  • Orange is the New Black — Episode 3
  • “We found that no one was ever hooked on the pilot,” Ted Sarandos, Netflix's content chief, says in a statement
  • Some countries get hooked sooner, others take longer
  • “The Dutch, for instance, tend to fall in love with series the fastest, getting hooked one episode ahead of most countries irrespective of the show.”
  • “members in Australia and New Zealand [got] hooked one to two episodes later than the rest of the world on almost every show.”
  • Conclusion: releasing an entire season of a TV show all at once is a better way to win over fans
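The "hooked episode" idea can be sketched as a search for the first episode after which at least 70 percent of its viewers went on to finish the season. The function name, the input format, and the viewer counts below are all hypothetical; this illustrates the 70 percent rule described above and is not Netflix's actual method or data.

```python
def hooked_episode(viewers_by_episode, completers_by_episode, threshold=0.70):
    """Return the first episode number (1-based) that reaches the completion threshold."""
    pairs = zip(viewers_by_episode, completers_by_episode)
    for number, (watched, finished) in enumerate(pairs, start=1):
        # Share of this episode's viewers who went on to finish the season
        if watched and finished / watched >= threshold:
            return number
    return None  # no episode reached the threshold

# Hypothetical counts: viewers of each episode, and how many of them finished the season
watched = [1000, 820, 700, 640]
finished = [500, 500, 500, 500]
print(hooked_episode(watched, finished))  # 3, since 500/700 is about 0.71
```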
E-Commerce Trends
  • Continued growth in retail e-commerce
  • Continued expansion of mobile, social, and local e-commerce
  • On-demand service firms show explosive growth
  • Continued growth in Big Data and business analytics
  • Continued growth of user-generated content on social networks, blogs, wikis
  • Government surveillance of online communications expands
Predictions for the Future
  • Technology will propagate through all commercial activity
  • Large, traditional companies will continue to play dominant role, consolidating audiences
  • Start-up ventures can still attract large audiences in non-dominated arenas
  • Concerns about increasing market dominance of Amazon, Google, Meta (Big Tech)
  • Additional factors:
  • Increased regulation and control
  • Online security issues
  • Cost of energy
The Role of Digital Technologies in Business
  • Growing interdependence between:
  • Ability to use technologies
  • Ability to implement corporate strategies and achieve corporate goals
  • Changes in strategy, rules, and business processes increasingly require changes in hardware, software, databases, and telecommunications. Meanwhile, what the organization would like to do depends on what its systems will permit it to do.
Discussion: What are the Roles of IT in These Companies?
  • Amazon.com
  • Google
  • Netflix
  • Air France
The Role of Digital Technologies in Business
  • Business firms invest heavily in technologies to achieve six strategic business objectives:
  • Operational excellence
  • New products, services, and business models
  • Customer and supplier intimacy
  • Improved decision making
  • Competitive advantage
  • Survival
  • (1) Operational excellence
  • Improvement of efficiency to attain higher profitability
  • Information systems and technology are important tools in achieving greater efficiency and productivity
  • Example: Walmart
  • Walmart’s Retail Link system links suppliers to stores for superior replenishment system
  • Combining information systems and best business practices to achieve operational efficiency
  • (2) New products, services, and business models
  • Information systems and technologies enable firms to create new products, services, and business models
  • Business model: describes how company produces, delivers, and sells product or service to create wealth
  • Example: Apple
  • Transformed old model of music distribution with iTunes and Apple Music streaming service
  • Constant innovations—iPod, iPhone, iPad, Apple Vision, etc.
  • (3) Customer and supplier intimacy
  • Serving customers well leads to customers returning, which raises revenues and profits
  • Example:
  • High-end hotels (e.g., Marriott) use computers to track customer preferences and to monitor and customize the environment
  • Intimacy with suppliers allows them to provide vital inputs, which lowers costs
  • Example:
  • Dell’s information system links sales orders to suppliers of components
  • (4) Improved decision making
  • Without accurate information:
  • Managers must use forecasts, best guesses, luck
  • Results in:
  • Overproduction, underproduction
  • Misallocation of resources
  • Poor response times
  • Poor outcomes raise costs, lose customers
  • Real-time data improves ability of managers to make decisions
  • Example:
  • Verizon’s Web-based digital dashboard provides managers with real-time data on customer complaints, network performance, and line outages
  • (5) Competitive advantage
  • Delivering better performance
  • Charging less for superior products
  • Responding to customers and suppliers in real time
  • Examples: Apple, Walmart, UPS
  • …But with a risk
  • IS gone wrong – American Airlines computer glitch
  • Caused 700 delayed flights, 125K affected passengers, FAA flight halt, public apology from CEO
  • IS done right – FedEx tracking system
  • Continuous updates and fine-tuning provide high-quality package tracking and have enabled FedEx to become a global leader in express transportation
  • (6) Survival
  • IT as necessity of business
  • Industry-level changes
  • Example:
  • Mobile payment in China
  • AI-assisted programming
  • Governmental regulations and reporting requirements
  • Example:
  • Sarbanes-Oxley Act – which aims to protect investors by improving the accuracy and reliability of corporate disclosures
Identifying E-Business Opportunities
  • View a firm as multiple business units – e.g., accounting, marketing, HR, etc.
  • Focus on specific business processes
  • Break business processes down into a series of value-adding activities
  • Which together generate profits and meet other goals
Strategic Business Unit Value Chains
  • Value chain (introduced by Michael Porter)
  • Organizing strategic business unit activities to design, produce, promote, market, deliver, and support the products or services
  • Primary activities performed in a strategic business unit
  • Identify customers, design, purchase materials and supplies, manufacture product or create service, market and sell, deliver, provide after-sale service and support
  • Importance depends on:
  • Product or service provided
  • Customers
  • Supporting activities performed by the central organization
  • Financial and administration, human resource management and technology development
  • Provide the infrastructure for primary activities
Value Chain for a Strategic Business Unit
  • Left-to-right flow - Does not imply strict time sequence
Role of IS in Value Chain Analysis
  • Value Chain Analysis
  • The process of analyzing a firm’s activities to determine where value is added to products or services
  • Used to identify opportunities where information systems can be used to gain a competitive advantage
Example: Amazon.com
  • Source: Kandermirli, B. (2018). Amazon.com’s digital strategies – Amazon.com case study
  • Example: Big data analytics can help to identify potential customers to make promotions more targeted and effective.
What is the Sharing Economy?
  • The sharing economy is defined as an economic system in which assets and services are shared between private individuals
  • Based on pooling and exchanging services, resources, goods, time, knowledge, skills, etc.
  • Also known as the collaborative economy
  • Examples: Uber, Airbnb
  • Key concepts: three key differentiation strategies
  • Technology
  • Partnership
  • User experience
Concept-Checking Questions
  • The concept-checking questions serve to guide you through the readings and video cases
  • The readings are usually case studies that give you a background understanding of a topic and its key concepts
  • The video cases provide additional examples that allow you to apply the concepts
  • Try your best to reflect on the concept-checking questions
  • We will have a discussion in the next class
Summary
  • Definitions of e-commerce and e-business
  • Key drivers and barriers to e-commerce/e-business adoption by organizations and consumers
  • Role of digital technologies in business
  • Value chain analysis for identifying e-business opportunities
  • Sharing economy
Lecture 2

E-Business Revenue Models

Learning Objectives
  • A quick recap: sharing economy
  • Understand the key elements of a business model
  • Understand what a revenue model is and how companies use various revenue models
  • Discuss revenue strategy issues that companies face when selling online
  • Use the business model canvas to understand online businesses
  • Class exercise: Case study on Feed. and Temu
What is the Sharing Economy?
  • The sharing economy is defined as an economic system in which assets and services are shared between private individuals
  • Based on pooling and exchanging services, resources, goods, time, knowledge, skills, etc.
  • Also known as the collaborative economy
  • Examples: Uber, Airbnb
  • Three key differentiation strategies
  • Technology
  • Partnership
  • User experience
Example of Uber
  • Which differentiation strategy does Uber adopt?
  • Technology / Partnership / User experience
  • What kind of value does Uber create?
  • Functional / Experiential / Social
  • How does Uber leverage users for its differentiation strategy?
  • Example of user inputs: pick-up/drop-off locations
  • Other examples?
Example of Turo
  • Which differentiation strategy does Turo adopt?
  • Technology / Partnership / User experience
  • What kind of value does Turo create for users?
  • Functional / Experiential / Social
  • Do you think the power hosts pose a threat to Turo’s business?
  • Yes / No / It depends
  • If Turo decides to regulate the activities of power hosts, what should Turo do?
  • Limit the number of cars for rent from a single host
  • A multi-tier host system – different amounts of commission
Example of DogVacay
  • Which differentiation strategy does DogVacay adopt?
  • Technology / Partnership / User experience
  • What kind of value does DogVacay create for users?
  • Functional / Experiential / Social / Experiential+Social
  • Why is it important for DogVacay to create a community?
  • Trust
Starting Example: Pinterest
  • What is Pinterest?
  • http://www.youtube.com/watch?v=oJzD4vF5dFA
How Much is Pinterest Worth?
  • Main product is the digital equivalent of cutting photos out of magazines
  • 445 million monthly active users worldwide
  • Launched in 2010
  • Worth $3.8 billion in 2013
  • Went public in 2019 at $10 billion valuation
  • $3.06 billion in revenue in 2023, 9% increase from 2022
  • How does the company make money?
  • In 2013, Ben Silbermann (Pinterest’s chief executive) said:
  • “Right now we don’t. But the big-picture assumption of the company is that there is a direct link between the things you pin and the things that you eventually spend money on. In there, we think, lies a model where we can actually make Pinterest more useful. And we can help businesses by bringing in more customers and helping them sell things and connect with people.”
Some Statistics
  • Demographics:
  • 14th largest social network in the world
  • 60% of users are women
  • Women aged 25-34 represent 29% of Pinterest audience
  • High-income and educated households are more likely to use Pinterest
  • Usage:
  • 82% of users are on mobile
  • 85% of users use Pinterest to plan new projects
  • Searches for “Christmas gift ideas” start as early as April
  • 97% of top searches are unbranded
  • 8 in 10 users say Pinterest makes them feel positive
  • Source: https://blog.hootsuite.com/pinterest-statistics-for-business/
Promoted Pins
  • Pinterest has been approaching advertising very carefully
  • Began by experimenting with “promoted pins” in 2013
  • Pricing model varies across objectives (e.g., brand awareness, referral, app install)
Buyable Pins
  • Pinterest introduced Buyable Pins in 2015
  • Pinners can buy from large retailers like Macy’s, Neiman Marcus and Nordstrom
  • Small businesses can sell using Buyable Pins with the help of e-commerce service providers (e.g., Shopify)
  • Currently Pinterest is not charging a fee to buy or sell on the site
  • Sellers still handle shipping and customer service
  • 55% of Pinterest users shop on the site
Visual Discovery Tool
  • Introduced the Visual Discovery Tool in 2017
  • Users can take a photo with their mobile phones and look for Related Pins
  • Ads for the searched items will be shown
Promoted Video Pins & Idea Pins
  • Introduced Promoted Video Pins in 2016
  • Video pins are in MP4 format
  • Autoplay when they are 50% in view
  • Introduced Idea Pins in 2021
  • Allow creators to record and edit creative videos with up to 20 pages of content
  • Source: https://techcrunch.com/2021/05/18/pinterest-introduces-idea-pins-a-video-first-feature-aimed-at-creators/
E-commerce Business Models
  • Business model
  • Set of planned activities designed to result in a profit in a marketplace
  • E-commerce business model
  • Uses/leverages unique qualities of Internet and Web
Key Elements of a Business Model
  • Value proposition
  • Revenue model
  • Market opportunity
  • Competitive environment
  • Competitive advantage
  • Market strategy
  • Organizational development
  • Management team
1. Value Proposition
  • “Why should the customer buy from you?”
  • Successful e-commerce value propositions:
  • Personalization/customization
  • Reduction of product search, price discovery costs
  • Facilitation of transactions by managing product delivery
  • Examples:
  • Amazon.com vs. book retailers
  • iTunes vs. music CD stores
  • Guiding analysis tool:
  • Value Proposition Canvas
Value Proposition Canvas
  • Source: https://www.strategyzer.com/library/the-value-proposition-canvas
  • Customer Profile
  • Value Map
  • The goal is to create a fit between the value map and the customer profile
2. Revenue Model
  • “How will you earn money?”
  • Major types of revenue models:
  • Sales revenue model (e.g., web catalog model)
  • Advertising revenue model
  • Subscription revenue model
  • Transaction fee revenue model
  • Freemium strategy
  • More details later
3. Market Opportunity
  • “What marketspace do you intend to serve and what is its size?”
  • Marketspace: Area of actual or potential commercial value in which company intends to operate
  • Realistic market opportunity: Defined by revenue potential in each market niche in which company hopes to compete
  • Market opportunity typically divided into smaller niches
Example: Google Shopping Insights
  • Focus on market niches
  • 83% of U.S. shoppers who visited a store in the last week say they used online search before going into the store
  • Shopping Insights helps businesses by showing which products people in their cities search for most
  • It breaks down search data by products, cities, and devices, and illustrates it in heat maps
  • Source: thinkwithgoogle.com
4. Competitive Environment
  • “Who else occupies your intended marketspace?”
  • Other companies selling similar products in the same marketspace
  • Includes both direct and indirect competitors, e.g.,
  • Direct competitors: Priceline vs. Travelocity
  • Indirect competitors: CNN.com vs. ESPN.com
  • Influenced by:
  • Number and size of active competitors
  • Each competitor’s market share
  • Competitors’ profitability
  • Competitors’ pricing
5. Competitive Advantage
  • “What special advantages does your firm bring to the marketspace?”
  • Is your product superior to or cheaper to produce than your competitors’?
  • Important concepts:
  • Asymmetries in financial backing, knowledge, information, and/or power (e.g., IBM, Apple, Google)
  • First-mover advantage (e.g., eBay, Amazon)
  • Network effect (e.g., Microsoft Office)
Example: Asian Tech Giants Lead US Patent Rankings
  • Samsung has led for the third year
  • U.S. firms lagging in AI patents
  • IBM had the steepest decline in the top 10 – shifting its focus to collaboration and open innovation
  • Source: https://www.rdworldonline.com/asia-leads-in-u-s-patents-in-2024-while-u-s-grants-show-signs-of-recovery/
6. Market Strategy
  • “How do you plan to promote your products or services to attract your target audience?”
  • Details how a company intends to enter market and attract customers
  • Best business concepts will fail if not properly marketed to potential customers
7. Organizational Development
  • “What types of organizational structures within the firm are necessary to carry out the business plan?”
  • Describes how firm will organize work
  • Typically divided into functional departments (e.g., accounting, marketing, IT, etc.)
  • As company grows, hiring moves from generalists to specialists
  • The Internet/technologies facilitate intra- and inter-organizational processes
  • E.g., outsourcing, offshoring, virtual teams, etc.
8. Management Team
  • “What kind of backgrounds should the company’s leaders have?”
  • A strong management team:
  • Can make the business model work
  • Can give credibility to outside investors
  • Has market-specific knowledge
  • Has experience in implementing business plans
  • Executive-level roles:
  • CIO (Chief Information Officer) – aiming to improve internal processes
  • CTO (Chief Technology Officer) – aiming to innovate products for customers
  • General and functional managers must
  • Understand the role IT plays in an information system
  • Be able to identify opportunities to use IT to their organization’s advantage
Categorizing E-Commerce Business Models
  • No single correct way
  • Name categories according to:
  • Sector (e.g., B2C)
  • Technology (e.g., mobile commerce)
  • Similar business models appear in more than one sector
  • Some companies use multiple business models (e.g., eBay)
Examples of B2C Business Models
  • E-tailer (e.g., Amazon)
  • Community provider (e.g., Facebook, LinkedIn)
  • Content provider (e.g., Netflix)
  • Portal (e.g., Yahoo)
  • Transaction broker (e.g., travel service)
  • Market creator (e.g., Uber, Airbnb)
  • Service provider (e.g., legal service)
What is a Revenue Model?
  • Revenue models describe different techniques for generating income
  • Mainly based upon the income from sales of products or services
  • Either by selling directly from the manufacturer or service supplier, or through an intermediary that takes a cut of the selling price
Disintermediation Brought by the Internet
  • Disintermediation of a consumer distribution channel showing (a) the original situation, (b) disintermediation omitting the wholesaler, and (c) disintermediation omitting both wholesaler and retailer
Revenue Models for Online Business
  • Web business revenue-generating models
  • Web catalog
  • Digital content
  • Advertising-supported
  • Advertising-subscription mixed
  • Fee-based
  • Same model can work for both sale types
  • Business-to-Consumer (B2C)
  • Business-to-Business (B2B)
Web Catalog Revenue Models
  • Adapted from mail-order (catalog) model
  • Seller establishes brand image
  • Printed information mailed to prospective buyers
  • Orders placed by mail or phone
  • Expands traditional model
  • Replaces or supplements print catalogs
  • Orders placed through Web site
  • Creates additional sales outlet for existing companies
Web Catalog Revenue Models (cont’d.)
  • Two main types of online retailers
  • Discounters began as online retailers, e.g., Overstock.com
  • Traditional retailers now use Web catalog revenue model, e.g., Costco, Kmart, Target, and Walmart
  • Using multiple marketing channels
  • Allows more customers to be reached at a lower cost
  • Marketing channel examples
  • Physical stores
  • Web sites
  • Mailed catalogs or newspaper inserts
  • Mobile phones
Walmart’s Holiday Toy Catalog
  • Directing customers to the website
Web Catalog Revenue Models (cont’d.)
  • The challenge: adding the personal touch
  • Online shopping lacks the in-store experience
  • Some solutions:
  • Interactive websites
  • Display products with rich information (e.g., photos, reviews)
  • Capability to search and filter
  • Online assistance
  • Online text chat (with human)
  • Chatbots
  • Virtual experience
  • Virtual try-on (e.g., for clothes)
Fee-for-Content Revenue Models
  • Firms owning written information or information rights
  • Embrace the Web as a highly efficient distribution mechanism
  • Use the digital content revenue model
  • Sell subscriptions for access to information they own
  • Examples:
  • Legal, academic, business and technical content
  • LexisNexis: offers variety of information services for lawyers and law enforcement officials
  • Academic information aggregation services
  • ProQuest and EBSCO: purchase rights and resell rights in subscription packages to schools, libraries, companies, and not-for-profit institutions
  • Dow Jones provides business-focused publications online
  • Newspaper, magazine, and journal materials
  • Factiva: subscriptions for purposes of business research, job searches, or investment analysis
Fee-for-Content Revenue Models (cont’d.)
  • Electronic books
  • Market leaders: Amazon’s Kindle products, Kobo, and Google Play Books
  • Sales include:
  • Books (sold individually or by subscription, e.g., Kindle Unlimited)
  • Magazine and newspaper subscriptions
  • Online music/video
  • Examples: Apple Music, Spotify, Netflix, Amazon Prime Video
  • Pricing model changing from individual sales to subscription (streaming)
  • New technologies improving delivery and experience
  • Fear of online sales impairing other sales types → Companies incorporating online distribution into revenue strategy
Advertising as a Revenue Model Element
  • Advertisers’ fees in place of users’ subscriptions
  • Advertising-supported revenue models
  • Traditionally used by TV broadcasting companies
  • Provides free programming and advertising messages
  • Advertising revenue is sufficient to support network operations
Advertising-Supported Revenue Models (cont’d.)
  • Problem 1: measuring and charging site visitor views
  • Stickiness
  • Keeping visitors at site and attracting repeat visitors
  • Exposed to more advertising in a sticky site
  • Problem 2: obtaining large advertiser interest
  • Requires demographic information collection
  • Set of characteristics used to group visitors
  • Firms can obtain large advertiser interest by:
  • General interest strategy:
  • Draw a large number of undifferentiated visitors
  • E.g., Yahoo! portal
  • Specific interest strategy:
  • Draw a specialized audience certain advertisers want to reach
  • E.g., Engadget, The Verge (“techie”), NBA.com (“sports fans”)
Advertising-Subscription Mixed Revenue Models
  • Advertising-subscription mixed revenue models
  • Subscribers pay fee and accept some advertising
  • Less advertising than advertising-supported sites
  • Newspapers/magazines using advertising-subscription mixed revenue model offer varying proportions of free content
  • Example 1: ESPN
  • Leverages brand name from cable television business
  • Sells advertising; offers free information
  • Mixed model includes advertising and subscription revenue (collects insider subscriber revenue)
  • Example 2: Netflix
  • A standard plan with ads costs less, but users will see a few short ads per hour
Assessing the Effectiveness of Advertising Revenue Models
  • Cost per thousand impressions (CPM; M stands for “mille”, which means thousand in Latin)
  • Cost per 1000 estimated views (impressions) of the ad
  • A single web page can contain multiple ads
  • Too many = bad experience; too few = reduced revenue
  • Click-through rate (CTR)
  • The number of clicks on an ad divided by the number of times the ad is shown
  • Revenue per click (RPC)
  • Particularly important for affiliate marketers who make money through commission by directing visitors to third-party sites
  • Stickiness (visitor engagement)
  • Page views per visit
  • More page views = more opportunities for ad revenue
  • 5-10 for a typical site; >30 for a social networking site
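The metrics above reduce to a few simple ratios. The sketch below works through them for a hypothetical campaign; the function names and all figures are illustrative assumptions, not data from the lecture.

```python
def cpm_cost(impressions, cpm_rate):
    """Advertiser's total cost: the CPM rate is charged per 1,000 impressions."""
    return impressions / 1000 * cpm_rate

def click_through_rate(clicks, impressions):
    """CTR: clicks on the ad divided by the number of times the ad was shown."""
    return clicks / impressions

def revenue_per_click(revenue, clicks):
    """RPC: revenue earned (e.g., affiliate commission) per click."""
    return revenue / clicks

def page_views_per_visit(page_views, visits):
    """A simple stickiness indicator: average pages viewed in one visit."""
    return page_views / visits

# Hypothetical campaign: 200,000 impressions at a CPM rate of 5, 1,000 clicks, 250 in commission
print(cpm_cost(200_000, 5))                # 1000.0
print(click_through_rate(1_000, 200_000))  # 0.005, i.e., 0.5%
print(revenue_per_click(250, 1_000))       # 0.25
print(page_views_per_visit(42_000, 6_000)) # 7.0, within the 5-10 range of a typical site
```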
Fee-for-Transaction Revenue Models
  • Online intermediary (fee-for-transaction Web site)
  • Offers visitor transaction information
  • Service fee charged based on transaction number or size
  • Examples:
  • Stock brokers
  • Insurance brokers
  • Online banking
  • Event tickets
  • Travel (e.g., Expedia, WaveHunters.com) – revenue from commissions and advertising
  • Disrupters: sharing economy companies
  • Uber
  • Airbnb
Fee-for-Service Revenue Models
  • Companies offer Web service
  • Fee based on service value
  • Not a broker service
  • Not based on transactions-processed number or size
  • Online professional services
  • Legal consultations
  • Accounting/taxation services
  • Prevalence of Web sites presenting health information
  • Difficulty of diagnosing without physical exam
  • Online consultations market is growing (revenue to reach €9.24bn in 2025)
  • E.g., Feeli, Mobi Doctor
Free for Many, Fee for a Few
  • Economics of manufacturing
  • Different for physical and digital products
  • Unit cost is a high percentage of total cost for physical products
  • Unit cost very small for digital products
  • Leads to a different revenue model
  • Offer basic product to many for free
  • Charge a fee to some for differentiated products
  • Examples: Google e-mail accounts
  • Inverse logic applied to physical products: free samples to entice sales (cookie samples)
Freeconomics Value Proposition
  • Free doesn’t mean no profit
  • Example 1: Google gives away search
  • Users give Google search results their attention
  • This can include attention to sponsored links
  • Google sells space for sponsored links
  • Advertisers pay Google for that attention to sponsored links
  • Some users convert into customers
  • Customers pay advertising firms for their products
  • Example 2: Adobe Reader software is offered free
  • This makes PDF documents a widely accepted standard
  • Companies need to buy the Acrobat software to edit PDF documents
How Freeconomics Works
  • Freeconomics is the leveraging of digital technologies to provide free goods and services to customers as a business strategy for gaining a competitive advantage
  • Two key considerations:
  • Marginal costs for digital services (e.g., storage price, reproduction costs) decrease tremendously over years
  • Revenue per user increases as there are more ways to exploit consumers (e.g., targeted ads)
Freemium Model
  • Core idea:
  • Basic features or limited usage are offered for free
  • Premium features or increased usage limits are offered for a fee (usually a subscription)
  • Key characteristics:
  • Free basic tier – to attract a large user base
  • Upsell opportunities – to incentivize users to upgrade to a paid plan for enhanced features and experience
  • Data collection – to gather data from free users for product improvement and targeted marketing
  • Examples: Spotify, Dropbox, mobile games
Challenges for Freemium Model
  • High computational costs for some applications (e.g., ChatGPT)
  • Risk of overuse and abuse – overwhelming servers and leading to misuse (e.g., spam, misinformation)
  • Potential for “freemium trap” – users become reliant on the free tier
Changing Strategies: Revenue Models in Transition
  • Companies must change revenue model
  • To meet needs of new and changing Web users
  • Some companies created e-commerce Web sites
  • Needed many years to grow large enough to become profitable (CNN and ESPN)
  • Some companies changed model or went out of business
  • Due to lengthy unprofitable growth phases
Multiple Changes to Revenue Models (cont’d.)
  • New York Times Web site’s ability to adapt
  • 1990s: advertising supported with subscription fee for specific access to premium crosswords, chess column and archived articles
  • 2005: additional content required subscription
  • 2007: returned to advertising-supported with free access
  • 2011: mixed revenue model
  • First 20 articles/month free
  • Subscription plans for continued access
  • Pay wall: barrier triggered by specific usage level
  • 2012: decreased to 10 free articles/month
  • 2017: decreased to 5 free articles/month
  • Now: a dynamic meter with a registration wall (soft conversion) and a pay wall
  • Source: https://www.inma.org/blogs/ideas/post.cfm/new-york-times-uses-machine-learning-to-create-a-smarter-paywall
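The fixed meters in the timeline (20, 10, then 5 free articles per month) can be sketched as a simple counter. This is only an illustration of a usage-triggered pay wall. The `Reader` class and `request_article` function are hypothetical names, and the current dynamic meter described in the source is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Reader:
    subscribed: bool = False
    articles_this_month: int = 0  # would be reset at the start of each month


def request_article(reader: Reader, free_limit: int) -> str:
    """Return 'serve' or 'paywall' for one article request under a fixed meter."""
    if reader.subscribed:
        return "serve"
    if reader.articles_this_month < free_limit:
        reader.articles_this_month += 1
        return "serve"
    return "paywall"  # barrier triggered once the usage level is reached


visitor = Reader()
outcomes = [request_article(visitor, free_limit=5) for _ in range(7)]
print(outcomes)

subscriber = Reader(subscribed=True)
subscriber_outcomes = [request_article(subscriber, free_limit=5) for _ in range(7)]
print(subscriber_outcomes)
```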
Revenue Strategy Issues for Online Businesses
  • Channel conflict
  • Company’s Web site sales activities interfere with existing sales outlets
  • Channel cooperation is the key to success
  • Many online retailers do not accept returned purchases at physical retail stores (e.g., C&A, Marks and Spencer)
  • Showrooming
  • “Bertice Jenson couldn’t believe how shameless they were. Right in front of her in the Benjy’s superstore in Oklahoma City, a young couple pointed a smartphone at a Samsung 50-inch Ultra HD TV and then used an app to find an online price for it. They did the same for a Sony and an LG LED model, as the Munchkins from The Wizard of Oz danced across all three screens.”
  • Some anti-showrooming tactics
  • Improve customer service, ask the manufacturers to create exclusive units, ask the manufacturers to pay a fee for showcasing, focus on installation and repairs, emphasize instant gratification
  • Source: Case Study: Can Retailers Win Back Shoppers Who Browse then Buy Online?
Revenue Strategy Issues for Online Businesses (cont’d.)
  • Some goods used to be difficult to sell online
  • Example: Luxury goods
  • Customers want to see product in person or touch
  • Online jewelry sales have grown rapidly
  • Blue Nile and IceTrends
  • Supported by:
  • General availability of independent appraisal certificates
  • “No questions asked” return policies
Business Model Canvas
  • A business model describes the rationale of how an organization creates, delivers, and captures value
  • Business model canvas is a visual overview of the key building blocks of a business model
  • A two-minute explanation of the business model canvas:
  • https://www.strategyzer.com/library/the-business-model-canvas
Nine Building Blocks of Business Model Canvas (Consider Each Block in the Given Sequence)
  • Customer segments
  • Who is your most important customer?
  • For which customer classes are you creating value?
  • Value propositions
  • What core value do you deliver to the customer?
  • Which customer needs are you satisfying?
  • Channels
  • Through which channels do your customers want to be reached?
  • Which channels work best? How much do they cost? How can they be integrated into your and your customers’ routines?
Nine Building Blocks of Business Model Canvas (cont’d)
  • Customer relationships
  • What relationship does the target customer expect you to establish?
  • How can you integrate that into your business in terms of cost and format?
  • Revenue streams
  • For what value are your customers willing to pay?
  • What do they currently pay for, and how? How would they prefer to pay?
  • How much does every revenue stream contribute to the overall revenues?
  • Key resources
  • What key resources does your value proposition require?
  • Which resources are most important in distribution channels, customer relationships, revenue streams, etc.?
Summary
  • Various web revenue models (web catalog, digital content, advertising-supported, advertising-subscription mixed, fee-based)
  • Revenue strategy issues (channel conflict, showrooming)
  • Value Proposition Canvas and Business Model Canvas
Lecture 3

E-Business Infrastructure

Learning Objectives
  • A quick recap: case study on Feed. and Temu
  • Introduce the basics of web infrastructure
  • Discuss some infrastructure management issues
  • Explain the concept of cloud computing
  • Case: Commonwealth Bank of Australia
  • Exercise: Case study on Google Cloud Platform and Riot Games
Case Study of Feed.
  • Feed. is a Paris-based foodtech that develops smart food which it sells online
  • Revenue model of Feed.?
  • Sales revenue model
  • Advertising revenue model
  • Subscription revenue model
  • Transaction fee revenue model
  • Freemium model
Case Study of Temu
  • Questions:
  • What is the core e-commerce value proposition of Temu?
  • What is identified as a major risk area for Temu’s business model?
  • How might Amazon and other established companies adapt their business models to retain market share, given the shift toward low-price, slower-delivery competitors like Temu?
Questions
  • Major risk areas for Temu’s business model
  • Product safety, quality, and regulatory concerns
  • How should other companies react?
  • Competing on price (e.g., Amazon Haul)
  • Flexible shipping options for sellers
  • Customer loyalty and trust (e.g., return policies, buyer protections)
  • Ethical sourcing and sustainability
Example: Amazon.com
  • Amazon has grown from an online bookstore to a complete marketplace
  • Amazon uses information systems (IS) to optimize processes, and now provides its IS to others
  • Amazon Web Services (AWS) is an IS infrastructure rented to companies for their enterprise system needs
  • AWS provides cloud services and hosting for other companies
Some Interesting Stats About Amazon
  • Amazon went from 1 category (Books) to over 30 main categories and 25,000 Sub-Categories
  • 200M+ items sold on Prime Day in 2024 (total sales=$14.2B; average order=$58)
  • A one-hour Amazon outage costs $34 million in sales ($9615/second)
  • Some suggested that 85% of the world’s products are available on Amazon
  • 85% of drunk purchases were made on Amazon
AWS Powered Prime Day
  • Substantial increases in storage, I/O operations and data transfer supported by AWS
  • Source: https://aws.amazon.com/blogs/aws/how-aws-powered-prime-day-2024-for-record-breaking-sales/
E-Business Infrastructure
  • E-business infrastructure refers to the combination of:
  • Hardware such as servers and client PCs
  • Network used to link the hardware
  • Software applications used to deliver services to workers within the e-business and also to its partners and customers
Web Server Basics
  • Basic technologies to build online business Web sites
  • Server software and hardware
  • Utility function software (e.g., order entry, payment processing)
  • Servers
  • Have more memory and larger, faster disk drives
  • Main job: respond to Web client requests
  • Computer providing files, making programs available to other computers connected to it through a network
  • Web browser software
  • Make computers work as Web clients
  • E.g., Edge, Chrome, Safari
Web Client/Server Architectures
  • Computer processing is split between client machines and server machines linked by a network.
  • Users interact with the client machines.
Web Client/Server Architectures (cont’d.)
  • In a multi-tiered client/server network, client requests for service are handled by different levels of servers.
Multiple Meanings of “Server”
  • Web server
  • Runs Web server software
  • Makes server’s files available to other computers
  • E-mail server
  • Handles incoming and outgoing e-mail
  • Database server
  • Runs database management software
  • Transaction server
  • Runs accounting and inventory management software
Platform Neutrality
  • Ability of a network to connect devices that use different operating systems
  • Critical in rapid spread of information and services
Primary Client: Mobile Devices
  • Primary Internet access is now through:
  • Tablets – supplementing PCs for mobile situations
  • Smartphones – considered a disruptive technology
  • Replacing cell phones and PDAs
  • Availability of apps (e.g., camera, GPS, MP3 player)
  • Shift in processors, operating systems
  • Example: Google has used mobile-friendliness as a ranking criterion since 2015
  • If a website isn’t mobile-friendly, it will see a decrease in search engine rankings
  • In 2023, Google clarified that no single factor determines rankings alone. Multiple aspects of page experience, including mobile-friendliness, should be considered when optimizing websites for search.
Infrastructure Management Issues
  • Storage needs
  • Demand fluctuations
  • Scalability
  • Cost management
Big Data and Increasing Storage Needs
  • Firms collect unprecedented levels of data
  • Big Data Analytics
  • Legal Compliance (e.g., Sarbanes Oxley)
  • Unprecedented levels of data require unprecedented infrastructure capabilities
  • Capturing the data requires more infrastructure
  • Storing the data requires more infrastructure
  • Analyzing the data requires more infrastructure
Demand Fluctuations
  • Many companies face demand fluctuations
  • Seasonal fluctuations (e.g., Christmas)
  • Monthly fluctuations (e.g., month-end spikes for stock market)
  • Demand fluctuations create inefficiencies
  • Some estimate that up to 70% of IS capacity is used only 20% of the time
  • IS infrastructure is typically not readily scalable
  • Changing internal capacity takes time
Capacity Planning and Scalability
  • Capacity planning
  • Process of predicting when hardware system becomes saturated
  • Ensuring firm has enough computing power for current and future needs
  • Factors include:
  • Maximum number of users
  • Impact of current, future software
  • Performance measures
  • Scalability: ability of system to expand to serve large number of users without breaking down
Total Cost of Ownership (TCO) Model
  • Used to analyze direct and indirect costs to help determine the actual cost of owning a specific technology
  • Direct costs: hardware, software purchase costs
  • Indirect costs: ongoing administration costs, upgrades, maintenance, technical support, training, utility, and real estate costs
  • Hidden costs: support staff, downtime, additional network management
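A TCO estimate adds the one-time direct costs to the recurring indirect and hidden costs over the ownership period. A minimal sketch, assuming hypothetical cost figures and a four-year horizon (none of these numbers come from the lecture):

```python
def total_cost_of_ownership(direct, indirect_per_year, hidden_per_year, years):
    """One-time direct costs plus recurring indirect and hidden costs."""
    recurring = sum(indirect_per_year.values()) + sum(hidden_per_year.values())
    return sum(direct.values()) + years * recurring


# Hypothetical figures for one on-premises system (assumptions)
direct = {"hardware": 40_000, "software": 10_000}
indirect_per_year = {
    "administration": 8_000,
    "maintenance": 4_000,
    "training": 2_000,
    "utilities": 3_000,
}
hidden_per_year = {"downtime": 5_000, "support_staff": 6_000}

tco = total_cost_of_ownership(direct, indirect_per_year, hidden_per_year, years=4)
purchase_share = sum(direct.values()) / tco

print(f"TCO over 4 years: {tco}")
print(f"Purchase price as share of TCO: {purchase_share:.0%}")
```

In this made-up case the purchase price is less than a third of the total, which is the kind of gap the TCO model is meant to expose.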
Cloud Computing as a Solution
  • Firms and individuals obtain computing power and software over Internet
  • Example: Google Apps, Microsoft Office 365, Oracle ERP Cloud
  • Fastest growing form of computing
  • Public, private, and hybrid clouds
  • Radically reduces costs of:
  • Building and operating Web sites
  • Infrastructure, IT support
  • Hardware, software
What is Cloud Computing?
  • Cloud Computing is a way to allocate resources much like a utility sells power
  • Resources are used “on-demand”, as needed
  • Customers only pay for what they consume
  • Resources can be rapidly allocated and reallocated
  • Consumption becomes an operating expense
  • % Utilization and Efficiency increase dramatically
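The utility analogy can be made concrete by comparing capacity owned for the peak against capacity paid for as consumed. A rough sketch under stated assumptions: the monthly demand profile and both unit prices are hypothetical, and the cloud unit price is deliberately set higher than the owned unit price.

```python
# Servers needed in each month of a year (hypothetical seasonal profile)
monthly_demand = [100, 100, 120, 100, 110, 100, 100, 100, 130, 150, 300, 500]

OWNED_COST_PER_SERVER_MONTH = 200.0   # assumption
CLOUD_COST_PER_SERVER_MONTH = 300.0   # assumption: higher unit price


def fixed_capacity_cost(demand, unit_cost):
    """Owned capacity must cover the peak month all year round."""
    return max(demand) * unit_cost * len(demand)


def on_demand_cost(demand, unit_cost):
    """Pay only for the servers actually consumed each month."""
    return sum(demand) * unit_cost


def utilization(demand):
    """Share of peak-sized capacity that is actually used over the year."""
    return sum(demand) / (max(demand) * len(demand))


fixed_cost = fixed_capacity_cost(monthly_demand, OWNED_COST_PER_SERVER_MONTH)
cloud_cost = on_demand_cost(monthly_demand, CLOUD_COST_PER_SERVER_MONTH)
owned_utilization = utilization(monthly_demand)

print(f"Fixed capacity: {fixed_cost:,.0f}")
print(f"On demand:      {cloud_cost:,.0f}")
print(f"Utilization of owned capacity: {owned_utilization:.0%}")
```

The outcome depends entirely on how spiky demand is. With a flat demand profile and the same prices, owning the capacity would be cheaper.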
Cloud Computing Platform
  • In cloud computing, hardware and software capabilities are provided as services over the Internet.
  • Businesses and employees have access to applications and IT infrastructure anywhere at any time using an Internet-connected device.
  • Major cloud service providers include Amazon, Microsoft, Google, IBM, Oracle, Alibaba (in Asia).
Why Cloud Computing?
  • The efficiency benefits are tremendous
  • Different customers have different demand spikes
  • Large data centers have economies of scale
  • Purchasing, deploying, and managing technology
  • Implementing green cooling technologies
  • Flexibly reallocating resources
  • Customers can focus on core operations
  • Infrastructure can be consumed as needed
  • Scalability no longer a limiting factor
Cloud Computing Service Models
  • Software as a Service (SaaS)
  • Hosts preinstalled applications which users just buy access to
  • Platform as a Service (PaaS)
  • Hosts an environment which programs can be executed on
  • Infrastructure as a Service (IaaS)
  • Hosts virtual machines which the customer installs an operating system on
Public and Private Clouds
  • Private clouds allow for rapid allocation of resources across internal needs
  • Public clouds offer additional advantages such as scalability and reliability, but may raise concerns about security and compliance
Managing the Cloud
  • Issues to consider (for either public or private cloud):
  • Availability/Reliability
  • Scalability
  • Viability
  • Security, Privacy, and Compliance
  • Diversity of offerings
  • Openness (e.g., open standards that are independent of vendor and platform)
  • Costs
Challenges with Cloud Adoption
  • Shifts responsibility for storage and control to providers
  • Security risks
  • Can introduce latency – delays in processing and transmitting of data
  • New policies and procedures for managing clouds
  • Contractual agreements with firms running clouds and distributing software required
Example: Amazon Elastic Compute Cloud (EC2) Service Level Agreement
  • Monthly Uptime Percentage of at least 99.99%
  • In case Amazon Elastic Compute Cloud (EC2) does not meet the Service Commitment, Service Credits will be paid as compensation
  • Service Credits are calculated as a percentage of the total charges paid
  • Example: Netflix Delivers Billions of Hours of Content Globally by Running on Amazon’s Cloud
Case: Netflix Running on AWS (cont’d.)
  • Netflix’s customers look at the artwork first and then decide whether to look at additional details
  • Question: How to improve the click-through rate for that first glance?
  • Goal: Select the best artwork for videos through A/B testing
  • Conclusion: Images that have expressive facial emotion that conveys the tone of the title do particularly well
  • The marked image significantly outperformed the other one
  • Source: https://medium.com/netflix-techblog/selecting-the-best-artwork-for-videos-through-a-b-testing-f6155c4595f6
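One common way to judge whether one artwork really outperforms another in an A/B test is a two-proportion z-test on the click-through rates. The sketch below uses that test as an assumption; the source does not state which statistical method Netflix applied, and the click and view counts are hypothetical.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare the click-through rates of variants A and B.

    Returns (ctr_a, ctr_b, z, two_sided_p_value).
    """
    ctr_a = clicks_a / views_a
    ctr_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (ctr_b - ctr_a) / std_err
    # Two-sided p-value under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ctr_a, ctr_b, z, p_value


# Hypothetical experiment: artwork A vs. artwork B
ctr_a, ctr_b, z, p_value = two_proportion_z_test(
    clicks_a=500, views_a=10_000, clicks_b=600, views_b=10_000
)
print(f"CTR A = {ctr_a:.2%}, CTR B = {ctr_b:.2%}, z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in click-through rate is unlikely to be due to chance alone.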
Example: Zynga
  • Zynga
  • The developer of wildly popular Facebook applications like FarmVille, Mafia Wars, and many others
  • 183 million monthly active users in 2021
  • Problem:
  • When Zynga releases a new game, it has no way of knowing what amount of computing resources to dedicate to the game.
  • The game might be a mild success, or a smash hit that adds millions of new users.
  • Requirement:
  • The ability to design applications that can scale up in the number of users quickly
  • Solution:
  • Zynga uses Amazon’s cloud computing platform to launch new offerings.
  • It can pay only for the resources it ends up using, and once game traffic stabilizes and reaches a steady number of users, Zynga moves the game onto its private zCloud, which is structurally similar to Amazon’s cloud, but operates under Zynga’s control in data centers on the East and West coasts.
  • Zynga’s own servers handle 80 percent of its games.
Case: Commonwealth Bank of Australia
  • Multi-Provider Cloud Model
  • Background reading:
  • Schlagwein, D., Thorogood, A., & Willcocks, L. P. (2014). How Commonwealth Bank of Australia gained benefits using a standards-based, multi-provider cloud model. MIS Quarterly Executive, 13(4), Article 5.
CBA background
  • CBA is a large multinational bank headquartered in Sydney, Australia.
  • The Australian federal government founded CBA in 1911, privatizing it in 1991.
  • In 2014, CBA employed 50,000 people, of which 6,000 were in “IT and operations.”
  • CBA managed total assets of A$800 billion (approximately $750 billion).
  • CBA is among top 20 IT consumers worldwide.
CBA’s IT Sourcing Modes 1/4
  • Customer trust is critical for banks, and that trust is underpinned by
  • Security
  • Compliance
  • Availability of IT systems
  • Cloud computing affects all three of these critical components.
CBA’s IT Sourcing Modes 2/4
  • First mode: CBA outsourced most of its IT needs to Electronic Data Systems (EDS) in 1996 under a 10-year contract.
  • Unit cost elements
  • Long-term agreements
  • Fixed fees
  • Guaranteed volumes
CBA’s IT Sourcing Modes 3/4
  • By the early 2000s, CBA had reduced internal IT staff because EDS was providing almost all its IT services, and, as a consequence, it had lost some in-house IT capabilities.
  • This loss of capabilities was particularly challenging for the coordination and integration of new services.
  • CBA increasingly faced IT-related tensions between internal units as well as with EDS.
CBA’s IT Sourcing Modes 4/4
  • Second mode: CBA drew on the regained internal IT capabilities to move its IT sourcing to a multi-provider sourcing mode in 2006.
  • With this mode, the bank managed a range of IT providers with individual contracts.
  • At the end of the 2000s, CBA still had substantial IT infrastructure costs and was seeking further cost reductions.
  • Cloud computing promised a cost-effective, pay-as-you-go IT sourcing mode that could also help get new applications to market faster.
CBA Multi-Provider Cloud Model 1/4
  • Drivers
  • Cost savings: Pay-as-you-go models and competitive pricing from multiple providers.
  • Speed and agility: Rapid provisioning of new environments for faster time-to-market.
  • Scalability and flexibility: Handle varying workloads and adapt to changing needs.
  • Leadership support: Commitment from IT leadership to cloud adoption.
  • Successful prototype: Demonstrated the feasibility and benefits of cloud computing.
  • Barriers
  • Existing contracts: Tied to legacy providers and their expertise.
  • Internal resistance: Cultural and organizational barriers to change.
  • Security and availability concerns: Uncertainty about cloud security and reliability.
  • Regulatory constraints: Limitations on data storage in the cloud due to regulations.
  • Perception of existing solutions: Belief that virtualization or multi-provider sourcing already address some needs.
Challenges of Moving to the New Model
  • Provider Challenges:
  • Some cloud providers were willing to work with CBA's cloud standards.
  • One large provider initially refused to support CBA's standards.
  • Technology Challenges:
  • Implementing the cloud model required strong IT capabilities from CBA, including:
  • Ability to architect complex solutions
  • Develop standards
  • Integrate internal and external IT
  • Govern new application development
  • Applications with few technical barriers and few non-technical constraints were the first to move to the cloud using the multi-provider cloud model.
Lessons Learned
  • Engage in standard-setting efforts
  • Enforce client-defined cloud standards
  • Negotiate flexible, short-term arrangements
  • Retain sufficient internal IT capabilities
  • Be pragmatic about legacy applications
Exercise: Case Study on Google Cloud Platform and Riot Games
  • Video case #1 (Google Cloud Platform)
  • https://www.youtube.com/watch?v=KurQlPvNQ0k
  • Video case #2 (Riot Games)
  • https://aws.amazon.com/fr/awstv/watch/21d17afe2b3/
  • Things to consider:
  • Risks of relying on a single cloud provider
  • Should Riot Games adopt the CBA’s multi-provider cloud model?
Summary
  • Web infrastructure basics
  • Infrastructure management issues
  • Cloud computing
  • Case: Commonwealth Bank of Australia
  • Exercise: Case study on Google Cloud Platform and Riot Games
Lecture 4

Big Data & Blockchain

Learning Objectives
  • A quick recap: case study on Google Cloud and Riot Games
  • Explain four V’s of big data and basic terminology of business intelligence infrastructure (e.g., Hadoop, OLAP)
  • Explain the concept of data mining
  • Discuss some examples of big data applications (Web analytics, mobile marketing)
  • Explain what blockchain is and how it works
  • Discuss some use cases and examples of blockchain
  • Discuss some common myths about blockchain
  • Exercise: Case study on Glu Mobile and Stronghold
Exercise: Case Study on Google Cloud Platform and Riot Games
  • Video case #1 (Google Cloud Platform)
  • https://www.youtube.com/watch?v=KurQlPvNQ0k
  • Video case #2 (Riot Games)
  • https://aws.amazon.com/fr/awstv/watch/21d17afe2b3/
  • Things to consider:
  • Risks of relying on a single cloud provider
  • Should Riot Games adopt the CBA’s multi-provider cloud model?
Multiple-Choice Question #2
  • Which of the following was NOT mentioned in the video as a major benefit of GCP?
  • Free migration support for legacy systems
  • Advanced AI & machine learning tools
  • Global scalability & security
  • Pay-as-you-go pricing
Multiple-Choice Question #6
  • What technical challenge in “Valorant” was mitigated using AWS’s global cloud infrastructure?
  • Slow rendering of graphics
  • Insufficient audio quality
  • Peeker’s advantage due to latency
  • Limited AI capabilities
What are the risks of relying on a single cloud provider, as Riot Games does?
  • Service outages
  • Security risks
  • Vendor lock-in
  • Skills and tools become narrowly focused on one platform – difficult to switch to or integrate with other providers
Should Riot Games Adopt the CBA’s Multi-provider Cloud Model?
  • Reasons for Yes
  • Diversifying risk
  • Improving resilience
  • Avoiding vendor lock-in
  • Reasons for No
  • Ultra-low latency gaming experience requires a highly optimized cloud environment
  • Performance gains may be trivial – are there many cloud providers that are larger/better than AWS?
  • More difficult to manage multiple cloud providers
Challenge of Big Data (Four V’s)
  • Businesses are dealing with the challenge of “Big Data”
  • High Volume
  • Unprecedented amounts of data
  • High Variety
  • Structured data
  • Unstructured data
  • High Velocity
  • Rapid processing to maximize value
  • High Veracity
  • Accuracy/quality of data
Fundamentals of Big Data Analytics
  • Big Data by itself, regardless of the size, type, or speed, is worthless
  • Big Data + “Big” Analytics = Value
  • E.g., imagine how Google or Facebook deals with the user data
  • Big Data also brings about big challenges
  • Effectively and efficiently capturing, storing, and analyzing Big Data
  • New breed of technologies needed (developed or purchased; hired or outsourced)
  • Cloud computing is the mainstream solution
Big Data Considerations
  • You cannot process the amount of data that you want to because of the limitations of your current platform
  • You cannot include new/contemporary data sources (e.g., social media, RFID, sensor, Web, GPS, textual data) because they do not comply with the data storage schema
  • The data is arriving so fast at your organization’s doorstep that your traditional analytics platform cannot handle it
  • You need to integrate data as quickly as possible to be current on your analysis
Business Intelligence Infrastructure
  • BI infrastructure provides a foundation for big data analytics
Data Warehouses
  • Data warehouse:
  • Database that stores current and historical data that may be of interest to decision makers
  • Consolidates and standardizes data from many systems, operational and transactional databases
  • Holds multiple subject areas
  • Holds very detailed information
  • Data mart:
  • Subset of a data warehouse that is highly focused and isolated for a specific population of users
  • Often holds only one subject area, e.g., Finance or Sales
  • May hold more summarized data
Extract, Transform, and Load
  • Extracting is the process of obtaining the necessary data.
  • Transformation is done to ensure that the data are in a common format and are free of errors.
  • Finally, the extracted and transformed data are loaded into the warehouse for use by decision makers.
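The three ETL steps can be sketched with Python's standard library. This is a toy illustration only: the CSV sample, the `orders` table, and the cleaning rules (drop rows with a missing amount, standardize region names) are assumptions, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system
RAW_ORDERS = """order_id,region,amount
1,east,100.50
2,West,
3,EAST,75.25
"""


def extract(text):
    """Extract: read the raw records."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and put values in a common format."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # missing amount: treat as an error record
        clean.append(
            (int(row["order_id"]), row["region"].strip().title(), float(row["amount"]))
        )
    return clean


def load(rows, conn):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(transform(extract(RAW_ORDERS)), warehouse)
loaded = warehouse.execute(
    "SELECT order_id, region, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```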
Hadoop
  • Open-source software framework for big data
  • Breaks data task into sub-problems and distributes the processing to many inexpensive computer processing nodes
  • Example: Google search is distributed to and processed by thousands of processors
  • Combines the results into a smaller data set that is easier to analyze
  • Key services
  • Hadoop Distributed File System (HDFS): data storage
  • MapReduce: breaks data into clusters for work
  • HBase: NoSQL database running on Hadoop
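The split-process-combine idea behind MapReduce can be imitated in a few lines of plain Python. This sketch runs in a single process and does not use Hadoop or its API; the word-count task and the sample chunks are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine each key's values into one result."""
    return {key: sum(values) for key, values in grouped.items()}


# In a real cluster each chunk would be processed on a different node
chunks = ["big data big analytics", "data mining", "big value"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

The final dictionary is the smaller, combined data set. In a real deployment the map and reduce steps would run in parallel across many nodes.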
Online Analytical Processing (OLAP)
  • Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions
  • Each aspect of information—product, pricing, cost, region, or time period—represents a different dimension
  • E.g., comparing sales in the East region in June versus May and July
  • Enables users to obtain answers to ad-hoc questions fairly rapidly, assuming an appropriate view is chosen
Online Analytical Processing (OLAP)
  • An OLAP cube is a multidimensional database structured to support slicing, dicing, and drill-down
Spreadsheets vs. OLAP Cube
  • 1) A single spreadsheet
  • 2) Multiple spreadsheets across different products, selling regions and time
  • 3) An OLAP cube that allows users to drill into the data from product, region, and time dimensions.
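A cube can be thought of as sales facts indexed by product, region, and time, which are then summed along whichever dimensions the user picks. A minimal sketch, assuming a tiny hypothetical fact table; `roll_up` is an illustrative helper, not part of any OLAP product.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "month")

# Hypothetical sales facts: (product, region, month, sales)
facts = [
    ("Nuts", "East", "May", 100),
    ("Nuts", "East", "June", 150),
    ("Nuts", "East", "July", 120),
    ("Bolts", "East", "June", 80),
    ("Nuts", "West", "June", 90),
]


def roll_up(facts, by, where=None):
    """Sum sales grouped by the dimensions in `by`.

    `where` fixes the value of some dimensions (a slice or dice).
    """
    where = where or {}
    totals = defaultdict(int)
    for *coords, sales in facts:
        row = dict(zip(DIMENSIONS, coords))
        if all(row[dim] == value for dim, value in where.items()):
            totals[tuple(row[dim] for dim in by)] += sales
    return dict(totals)


# Slice: East region only, viewed by month (June versus May and July)
east_by_month = roll_up(facts, by=("month",), where={"region": "East"})
# Drill down: the same slice, broken out by product as well
east_by_month_product = roll_up(
    facts, by=("month", "product"), where={"region": "East"}
)
print(east_by_month)
print(east_by_month_product)
```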
Examples of Big Data Applications
  • Macy's Inc. and real-time pricing
  • The retailer adjusts pricing in near-real time for 73 million (!) items, based on demand and inventory, using technology from SAS Institute
  • Wal-Mart Stores Inc. and search
  • Its search engine includes semantic data, which describe the meaning of data items and their relationships
  • The platform relies on text analysis, machine learning and even synonym mining to produce relevant search results
  • Wal-Mart says adding semantic search has increased the share of online shoppers completing a purchase by 10% to 15%. In Wal-Mart terms, that is billions of dollars
  • Source: techtarget.com
Data Mining
  • Data mining is a process that uses statistical, mathematical, artificial intelligence and other techniques to extract useful information from large databases
  • We can use data mining to answer questions such as:
  • What will customers buy?
  • What products to sell together?
  • How can a company predict which customers are at risk for churning?
  • Examples:
  • Amazon.com
  • Netflix
Common Data Mining Techniques
  • Identifying associations: Finding relationships between things that happen at the same time (e.g., basket analysis)
  • Identifying sequences: Showing the order in which things happen (e.g., path or click-stream analysis of a web site)
  • Classification: Predicting future behavior based on past data (e.g., buying a product or not)
  • Clustering: Finding groups of similar things that weren't known before (e.g., customer segments)
  • Modeling: Using data to predict important outcomes (e.g., sales)
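As an illustration of the first technique, basket analysis usually reports two measures for a pair of items: support (how often they appear together) and confidence (how often the second appears given the first). A small sketch with hypothetical shopping baskets:

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (one set of items per basket)
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def support(item_a, item_b):
    """Share of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]


bread_milk_support = support("bread", "milk")
bread_to_milk_confidence = confidence("bread", "milk")
print(f"support(bread, milk) = {bread_milk_support:.2f}")
print(f"confidence(bread -> milk) = {bread_to_milk_confidence:.2f}")
```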
Data Mining Technique: Decision Tree
  • Decision Tree is a classification method.
  • It is a structure that can be used to divide a large collection of records into successively smaller sets of records by applying a sequence of simple decision rules.
  • It is powerful/popular for solving business problems such as approval of loans/credit cards and targeted promotion.
  • Rules from a decision tree can be translated to English.
  • Example:
  • IF Age <=43 & Sex = ‘Male’ & Credit Card Insurance = ‘No’
  • THEN Life Insurance Promotion = ‘No’
  • Figure: Original Decision Tree. The root node tests “Steady job”; its two child nodes test “Adequate assets”; the four child nodes below them test “Adequate income”. Each node has a yes branch and a no branch, and the eight leaves are labelled “Approve the loan” (three leaves) or “Not approve the loan” (five leaves).
In reality, there would be more cases, and very likely cases that lead to contradictory rules. The idea is to generate rules that generalize to as many cases as possible. Further, the number of nodes should be minimized as far as possible.
  • In this example, even without the help of the computer, we can easily simplify the decision tree, i.e., minimize the number of nodes.
  • Figure: Simplified Decision Tree (one root node, two child nodes, four leaves)
Encoding the Decision Tree
  • Two rules based on the Decision Tree:
  • IF the applicant has adequate income AND the applicant has a steady job THEN approve the loan
  • IF the applicant has adequate income AND the applicant does NOT have a steady job AND the applicant has adequate assets THEN approve the loan
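The two rules translate directly into code. A minimal sketch, assuming each criterion has already been reduced to a yes/no value; the function name `approve_loan` is illustrative.

```python
from itertools import product


def approve_loan(adequate_income: bool, steady_job: bool, adequate_assets: bool) -> bool:
    """Apply the two rules; any case they do not cover is not approved."""
    if adequate_income and steady_job:
        return True
    if adequate_income and not steady_job and adequate_assets:
        return True
    return False


# Enumerate every combination of the three yes/no criteria
decisions = {
    combo: approve_loan(*combo) for combo in product([True, False], repeat=3)
}
approved = [combo for combo, ok in decisions.items() if ok]

for (income, job, assets), ok in decisions.items():
    verdict = "approve" if ok else "not approve"
    print(f"income={income!s:5} job={job!s:5} assets={assets!s:5} -> {verdict}")
```

Three of the eight possible combinations lead to approval, and no applicant without adequate income is approved.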
Utilizing the Results of Data Mining
  • Data mining results can be delivered to users in a variety of ways
Example: Amazon’s Anticipatory Shipping
  • Amazon filed a patent for an algorithm-based system that ships products before you even place an order
  • The patent summary describes a method for shipping a package of one or more items "to the destination geographical area without completely specifying the delivery address at time of shipment," with the final destination defined en route
  • This forecasting model uses data from your prior Amazon activity, including time on site, duration of views, links clicked and hovered over, shopping cart activity and wish lists.
  • When possible, the algorithm also sprinkles in real-world information gleaned from customer telephone inquiries and responses to marketing materials, among other factors.
  • Source: http://www.predictiveanalyticsworld.com/patimes/amazon-knows-what-you-want-before-you-buy-it/3185/
Evaluating Web Site Performance
  • Measure the performance of a “live” site on a regular basis
  • Establish measurable performance benchmarks
  • Evaluate actual performance against the benchmarks
  • Learn from the experience and make necessary changes
  • Revisit performance benchmarks and make changes, if necessary
  • Benchmarks
  • Performance-based goals
  • Developed by observing actual performance of similar e-businesses or reviewing industry averages
  • Typically include sales goals
  • Typically include evaluation of visitors’ actions at the Web site
Web Analytics
  • Used to help determine Web site return on investment (ROI)
  • Common metrics for measuring visitor behaviors and actions:
  • Visit or session measures continuous requests for pages by a single visitor’s Web browser for a specific period of time
  • Unique visitors measures the number of individual visitors to a site
  • Repeat visitors measures unique visitors who return to the site
  • Page views or impressions measures the number of times a specific page is viewed
  • Page views per visitor measures how deep a visitor goes into a site
Web Analytics (cont’d.)
  • Common metrics for measuring visitor behaviors and actions:
  • IP addresses identifies origin of unique visitors
  • Referring URLs indicates how visitors reached the site
  • Browser type identifies which browsers visitors are using
  • Click-stream analysis shows the path visitors take from page to page at the site
  • Conversion rate indicates the rate at which visitors become customers
  • Shopping cart abandonment indicates how many customers fail to complete their purchase
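To make the metrics above concrete, here is a small sketch that computes several of them from a page-request log. The log format, the page names, and the choice to measure conversion per unique visitor are all assumptions made for illustration; analytics tools define these metrics in slightly different ways.

```python
from collections import defaultdict

# Hypothetical page-request log: (visitor_id, session_id, page)
log = [
    ("v1", "s1", "/home"), ("v1", "s1", "/product"), ("v1", "s1", "/cart"),
    ("v1", "s2", "/cart"), ("v1", "s2", "/checkout-complete"),
    ("v2", "s3", "/home"), ("v2", "s3", "/cart"),
    ("v3", "s4", "/home"),
]

sessions_by_visitor = defaultdict(set)
pages_by_session = defaultdict(list)
for visitor, session, page in log:
    sessions_by_visitor[visitor].add(session)
    pages_by_session[session].append(page)

visits = len(pages_by_session)                      # number of sessions
unique_visitors = len(sessions_by_visitor)          # distinct visitors
repeat_visitors = sum(1 for s in sessions_by_visitor.values() if len(s) > 1)
page_views = len(log)                               # every page request counts
page_views_per_visitor = page_views / unique_visitors

# Conversion rate: share of unique visitors who completed a purchase
buyers = {visitor for visitor, _, page in log if page == "/checkout-complete"}
conversion_rate = len(buyers) / unique_visitors

# Cart abandonment: sessions that reached the cart but did not complete checkout
cart_sessions = [s for s, pages in pages_by_session.items() if "/cart" in pages]
abandoned = [s for s in cart_sessions if "/checkout-complete" not in pages_by_session[s]]
cart_abandonment_rate = len(abandoned) / len(cart_sessions)

print(visits, unique_visitors, repeat_visitors, page_views)
print(f"conversion {conversion_rate:.0%}, abandonment {cart_abandonment_rate:.0%}")
```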
Google Analytics: Traffic Sources
  • Visitors may be referred to the site by search engines (e.g., Google, Yahoo) through either organic search results or pay-per-click ads.
  • Some visitors may come directly to the site. They either remember the URL or have the URL in their bookmarks.
  • Alternative traffic sources can be links from emails or other affiliated sites.
  • Therefore, the traffic sources can indicate which of the marketing tactics pay off.
Discussion: Shopping Cart Abandonment
  • Google Analytics In Real Life - Online Checkout
  • https://www.youtube.com/watch?v=3Sk7cOqB9Dk
  • What are the common problems with online checkout?
  • Login information forgotten
  • Excessively long agreement statement to read
  • Connection lost after idle time (e.g., 15 minutes)
  • Security checking procedures
  • Delivery options
Solution: Funnel Visualization and Goal Flow
  • Specify goals (e.g., sign-up, purchase)
  • Specify the steps a user has to go through to achieve the goals
  • Measure the drop-out rate at each step
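The drop-out rate at each step is simply the share of visitors who entered that step but did not reach the next one. A minimal sketch, using made-up step names and visitor counts:

```python
# Hypothetical visitor counts at each step of a purchase funnel
funnel = [
    ("View cart", 1000),
    ("Enter shipping details", 600),
    ("Enter payment details", 450),
    ("Purchase complete", 360),
]


def drop_out_rates(steps):
    """Return (step name, fraction of visitors lost before the next step)."""
    rates = []
    for (name, entered), (_, reached_next) in zip(steps, steps[1:]):
        rates.append((name, (entered - reached_next) / entered))
    return rates


for name, rate in drop_out_rates(funnel):
    print(f"{name}: {rate:.0%} dropped out")

# Share of visitors who started the funnel and achieved the goal
overall_completion = funnel[-1][1] / funnel[0][1]
print(f"Overall completion: {overall_completion:.0%}")
```

The step with the highest drop-out rate is usually the first place to look for checkout problems such as those listed in the discussion above.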
Example of Funnel Visualization
  • The funnel visualization shows, of all the visitors who started down the purchase path (viewing the cart in this case), how many reached each step in the process and how many dropped out at each step.
  • The visitor can view other pages in between two steps, such as navigating to the site’s return policy before completing the checkout.
  • Source: https://www.smartlook.com/blog/funnel-analysis-google-analytics/
Mobile Marketing
  • Mobile marketing describes any attempt to appeal to potential customers with some sort of marketing message delivered through their mobile devices
  • Being the most personal form of web marketing
  • Mobile phones are with us all the time
  • Smart phones have access to our entire address book and calendar
  • Our mobile phones know what kind of entertainment we like, e.g., video, music, games
  • Also the most targeted form of web marketing
  • We can tell a lot about a person based on his/her cell phone
  • The mobile phone and the way it is used can provide powerful demographic and psychographic signals about the owner
  • People choose different carriers, handsets, or phone features because of their social and utilitarian needs
Mobile Marketing (cont’d)
  • A more immediate form of web marketing
  • People check their mobile phones often—sometimes habitually or incessantly
  • Messages are often immediately read
  • This immediacy makes mobile marketing an extraordinary marketing option for last-minute or time-sensitive calls to action
  • Examples:
  • Customized SMS or MMS promotion
  • Notifications from mobile apps
  • [Figure: Data on Your Smartphone (Source: Statista)]
Example: Google Map Tracks Your Every Move
  • Google tracks our every footstep and places a red dot on its map for each location to keep a record of where users have been
  • “You can yourself check your every move from here. You just need to log in with the same account you use on your Smartphone, that’s it. The map will display all the records of everywhere you've been for the last day to month on your screen,” Elizabeth Flux, editor of Voiceworks magazine wrote.
What Else Could Your Smartphone Know About You?
  • A study indicates that a combination of smartphone sensors can detect when a user is having a manic or depressive episode
  • Activity and location data on the smartphone could predict a change in mood with 94 percent accuracy.
  • Monitoring phone calls for frequency and speed of calls bumped that accuracy up to 97 percent.
  • When users showed an increase in average activity (measured by GPS location and accelerometer speed), faster calls and more calls total, it indicated a shift into a manic episode.
  • When activity, call length and call volume regressed, it signaled a shift into depression.
  • Source: MIT Technology Review
Applications of Mobile Marketing
  • Indoor location tracking
  • The most widely used technique is to intercept WiFi signals emitted by shoppers’ smartphones
  • Allow retailers to track shoppers’ physical movements in shops/shopping malls
  • Questions that can be answered:
  • Where to move an escalator that was interfering with an entrance?
  • How long visitors stay after watching a movie?
  • Do they get one soda, hop in the car, and leave? Or are they staying longer?
  • When to send a coupon to visitors?
  • Source: MIT Technology Review
Applications of Mobile Marketing (cont’d.)
  • DensePose from WiFi
  • A neural network architecture that uses only WiFi signals from three transmitters and receivers for human dense pose estimation
  • Source: https://medium.com/syncedreview/cmus-densepose-from-wifi-an-affordable-accessible-and-secure-approach-to-human-sensing-6440173e9795
  • SMS mobile ads on trains
  • People living in cities commute 48 minutes each way on average
  • Research found that commuters in crowded subway trains are about twice as likely to respond to a mobile offer by making a purchase vis-à-vis those in non-crowded trains
  • The purchase rates measured 2.1% with fewer than two people per square meter, and increased to 4.3% with five people per square meter
  • Source: Andrews, M., Luo, X., Fang, Z., & Ghose, A. (2015). Mobile Ad Effectiveness: Hyper-Contextual Targeting with Crowdedness. Marketing Science, 35(2), 201-340.
  • Osaka: 14.2 people per square meter on Japanese train
Blockchain
  • Distributed ledgers in a peer-to-peer distributed database
  • Maintains a growing list of records and transactions shared by all
  • Encryption used to identify participants and transactions
  • Early use cases:
  • Foundation of Bitcoin and other crypto currencies (e.g., Ethereum)
  • Now used more widely for other purposes:
  • Financial transactions, supply chain, medical records, food safety, etc.
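The "growing list of records" idea can be illustrated with a hash-linked list of blocks, where each block stores the hash of the previous one. This is only a teaching sketch under simplifying assumptions: the block fields and function names are invented here, and real blockchains add digital signatures, a consensus mechanism, and replication across many peers.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """SHA-256 over the block contents, serialized in a stable key order."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, transactions):
    """Append a block that points back to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["prev_hash"], block["transactions"]):
            return False
    return True


ledger = []
add_block(ledger, ["Alice pays Bob 5"])
add_block(ledger, ["Bob pays Carol 2"])
print(is_valid(ledger))  # True

ledger[0]["transactions"] = ["Alice pays Bob 500"]  # tamper with history
print(is_valid(ledger))  # False: the stored hash no longer matches the contents
```

Because every block depends on the hash of the one before it, changing an old record invalidates the chain unless all later blocks are recomputed, which is what makes shared ledgers tamper-evident.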
Types of Blockchain
  • Source: https://www.gemini.com/cryptopedia/blockchain-types-pow-pos-private
Bitcoin Mining as a Business
  • Computers around the world race to solve computationally hard cryptographic puzzles
  • A Bitcoin transaction takes about 10 minutes to an hour on average to be confirmed
  • Profitability:
  • Using one of the most advanced miners (Antminer S21; costs about $2800)
  • You will earn around $226 per month
  • Time for ROI is 12-13 months
  • Source: https://hashrateindex.com/blog/bitmain-antminer-s21-profitability/
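The payback period quoted above follows from dividing the purchase price by the monthly earnings. The sketch below reproduces that arithmetic; it assumes the $226 figure is net of running costs and stays constant, which in practice depends on the Bitcoin price, mining difficulty, and electricity rates.

```python
miner_cost_usd = 2800        # purchase price quoted on the slide
monthly_earnings_usd = 226   # monthly earnings quoted on the slide (assumed constant)

# Months needed for cumulative earnings to cover the purchase price
payback_months = miner_cost_usd / monthly_earnings_usd
print(f"Payback period: {payback_months:.1f} months")  # about 12.4 months
```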
Five Common Blockchain Myths
  • Source: https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/blockchain-explained-what-it-is-and-isnt-and-why-it-matters
Example: Smart Contracts for Flight Delay Insurance
  • Source: https://www.propertycasualty360.com/2020/01/24/how-blockchain-and-smart-contracts-will-disrupt-insurance/
Example: IBM Food Trust
  • Carrefour and IBM Food Trust collaborate to implement a global food traceability standard across all of the links in the chain – from producers through to sales channels
  • A collaborative blockchain network between manufacturers and distributors stores essential product safety information:
  • Traceability information about product origin and quality
  • Information about the nutritional properties of products and the potential presence of any allergens or questionable substances
  • Traceability shared across the whole supply chain in the event of a product recall, a health issue or non-compliance with specifications or a particular label
  • Other companies that have joined the network:
  • Walmart, Nestle, Dole Food, Tyson Foods, Kroger
Video Case: Glu Mobile and Stronghold
  • Glu Mobile uses the SAP Big Data Service to improve its gaming experience.
  • Stronghold is a company that provides payment and financial infrastructure, including real-time clearing, settlements, micropayments, and custom virtual payment networks.
  • Key questions to consider:
  • How are the 4 V’s relevant to the user data of Glu Mobile?
  • What are the key performance metrics that Glu Mobile should look at?
  • Can smart contracts work equally well for complex insurance products?
Summary
  • Basics of big data and its terminology
  • The concept of data mining
  • Some examples of big data applications (Web analytics, mobile marketing)
  • Blockchain and its applications
  • Exercise: Case study on Glu Mobile and Stronghold
Lecture 5

Artificial Intelligence & Machine Learning

Learning Objectives
  • A quick recap: Case study on Glu Mobile and Stronghold
  • Understand the concept of artificial intelligence (AI)
  • Understand the concept of machine learning
  • Discuss various examples of AI and machine learning
  • Discuss the future of AI
  • In-class exercises: decision tree classification
  • Weekly exercise: MNIST number recognition, video cases of VideoPeel and AI Cat Flap
Case #1: Glu Mobile
  • Which of the four V’s that define big data is the least relevant to the case of Glu Mobile?
  • Volume
  • Variety
  • Velocity
  • Veracity
  • Which of the following metrics can best indicate the extent of user engagement?
  • Number of downloads
  • Number of purchases of game items
  • Average play time per user
  • Number of updates
  • Which of the following metrics can best indicate how well the company retains users in the long run?
  • Number of downloads
  • Number of purchases of game items
  • Number of reviews in Google Play store
  • Percentage of users who install updates
Case #2: Stronghold
  • Can smart contracts work equally well for complex insurance products?
  • The line between “simple” and “complex” claims (flight vs. health insurance)
  • Verification challenges (objective vs. subjective data; automated vs. human judgement)
  • Legal and regulatory challenges (data sharing, privacy concerns)
Artificial Intelligence (AI)
  • Popular perception driven by science fiction
  • Robots good at everything except emotions, empathy, appreciation of art, culture, etc.
  • AI is a subfield of computer science
  • The goal is to make machines do things that would require intelligence if done by humans
  • General-purpose AI like the robots of science fiction is incredibly hard
  • Human brain appears to have lots of special and general functions, integrated in some amazing way that we really do not understand yet
  • Special-purpose AI is more doable
  • E.g., chess/poker/Go playing programs, logistics planning, automated translation, speech and image recognition, web search, data mining, medical diagnosis, keeping a car on the road, etc.
Definitions of AI
  • The “act rationally” approach is usually emphasized
  • It is often easier to define rational action than rational thought.
  • Rationality is more cleanly defined than human behavior, so it is a better metric.
  • When human behavior is not rational, we would prefer rationality.
  • Example: People would not want a shopping agent to recommend impulse purchases (i.e., purchasing something you were not planning to buy).
“Chinese Room” Argument (Searle 1980)
  • Person who knows English but not Chinese sits in room and receives notes in Chinese
  • Has systematic English rule book for how to write new Chinese characters based on input Chinese characters, returns his notes
  • Person=CPU, rule book=AI program
  • Has no understanding of what they mean
  • But from the outside, the room gives perfectly reasonable answers in Chinese
  • Searle’s argument: the room has no intelligence in it!
  • More discussion available here: https://plato.stanford.edu/entries/chinese-room/
Turing Test
  • Human judge communicates with a human and a machine over text-only channel
  • Both human and machine try to act like a human
  • Judge tries to tell which is which
  • Loebner prize: the oldest Turing Test contest
  • Five-time winner (2013, 2016-19): Mitsuku (or Kuki)
  • https://chat.kuki.ai/
AI Systems Vs. Human Beings
  • AI can now support “Artistic Creativity” to a considerable extent, with the advances in computer vision and machine learning.
  • Some argued that artists cannot be replaced by machines.
Generative AI Art
  • Was once predicted to be the next big thing in Non-Fungible Tokens (NFTs)
  • NFT market cap: grew from $100 million to $50 billion in the last couple of years
  • Coinbase has started an NFT marketplace
  • Over 95% of NFTs created in 2021-2022 are now worthless
  • Twitter and Instagram have removed support for NFTs
  • Portrait paintings generated by a Generative Adversarial Network, AI Gogh
  • Source: https://medium.datadriveninvestor.com/next-big-thing-in-nfts-generative-ai-art-c4139ba0dfae
Relationship between AI, Machine Learning, Neural Networks, Deep Learning, and GenAI
  • Source: https://www.engenome.com/news/AI-regulation/
Three Major Types of AI Systems (1)
  • Analytical AI
  • These AI systems have only characteristics consistent with cognitive intelligence
  • They generate a cognitive representation of the world and use learning based on past experience to inform future decisions
  • Most AI systems today fall into this group, for example:
  • Fraud detection in financial services
  • Image recognition
  • Self-driving cars
Three Major Types of AI Systems (2)
  • Human-Inspired AI
  • These AI systems have elements from cognitive and emotional intelligence
  • They can understand and consider human emotions in their decision making
  • Examples:
  • Email sentiment analysis
  • Using advanced vision systems to recognize emotions like joy, surprise, and anger at the same level as (and frequently better than) humans
  • Companies can use such systems to recognize emotions during customer interactions or while recruiting new employees
Three Major Types of AI Systems (3)
  • Humanized AI
  • These AI systems show characteristics of cognitive, emotional, and social intelligence
  • Systems that are self-conscious and self-aware in their interactions with others are not available yet
  • Building AI systems that actually experience the world in a fundamental way is a project for the potentially distant future
Major AI Approaches
  • Logic and rule-based approach
  • Rules typically take the form of an {IF:THEN} expression
  • Example: {IF 'red' AND 'octagon' THEN 'stop-sign'}
  • Machine learning (pattern-based approach)
  • Takes a probabilistic approach by using historical data and outcomes
  • Understands patterns and trends in historic data, and gives you the probability of different outcomes based on that
  • Machines need to be trained
  • Can adapt to changing trends
Three Broad Types of Learning (1)
  • Supervised learning
  • Map a given set of inputs into a given set of labeled outputs
  • Each data entry has been tagged with “the correct answer”
  • Example: Using a large database of labeled images to separate between those images showing a Chihuahua and those showing a muffin
Training and Evaluating a Supervised Model
  • Step 1: Training a model by making a prediction from a labeled example and updating its predictions for each labeled example in the training dataset
  • Step 2: Evaluating a model by comparing its predictions to the actual values
  • Source: https://developers.google.com/machine-learning/intro-to-ml/supervised
What is Classification?
  • Classification belongs to the category of supervised learning
  • Classes/Targets are provided with the input data
  • Classification is the process of predicting the class of given data points
  • Classes are sometimes referred to as Targets or Labels
  • Classification modeling approximates a mapping function from input variables (x) to discrete output variables (y)
  • Applications: credit approval, target marketing, etc.
General Approach to Classification
  • Training set of records with known class labels is used to build a classification model
  • Test set of previously unseen records with known class labels is used to evaluate the quality of the model
  • Training set and Test set are usually separate datasets drawn from the same original dataset; they should contain similar data
  • Finally, the classification model is applied to new records with unknown class labels
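The workflow above (split the data, build a model on the training set, evaluate on the test set, then classify new records) can be sketched in a few lines. The records, the 75/25 split, and the use of a 1-nearest-neighbor classifier are all assumptions chosen to keep the example short; any classification algorithm follows the same steps.

```python
import random

# Hypothetical labeled records: (income in $1000s, debt in $1000s) -> class label
records = [
    ((80, 5), "approve"), ((75, 10), "approve"), ((90, 8), "approve"), ((85, 12), "approve"),
    ((30, 25), "reject"), ((25, 30), "reject"), ((35, 28), "reject"), ((20, 22), "reject"),
]

# Randomly split the same original dataset into a training set and a test set
random.seed(0)
random.shuffle(records)
split = int(0.75 * len(records))
train, test = records[:split], records[split:]


def predict(x, training_set):
    """1-nearest neighbor: return the label of the closest training record."""
    nearest = min(
        training_set,
        key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)),
    )
    return nearest[1]


# Evaluate on records the model has not seen but whose labels are known
correct = sum(predict(x, train) == label for x, label in test)
accuracy = correct / len(test)
print(f"Test accuracy: {accuracy:.0%}")

# Apply the model to a new record with an unknown class label
print(predict((78, 9), train))
```

The toy data is deliberately well separated, so the test accuracy is perfect here; real datasets rarely behave this way, which is why the held-out test set matters.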
Most Popular Classification Algorithms
  • K-Nearest Neighbors
  • Source: https://www.jcchouinard.com/classification-in-machine-learning/
  • Support Vector Classification
Three Broad Types of Learning (2)
  • Unsupervised learning
  • Datasets contain the input data but no corresponding output data
  • The algorithm needs to infer the underlying structure from the data itself
  • Example: Cluster analysis – grouping elements in similar categories but neither the structure of those clusters nor their number is known in advance
  • Since the output is derived by the algorithm itself, it is not possible to assess the accuracy or correctness of the resulting output (which is often subject to human interpretation)
Two Categories of Unsupervised Learning
  • Clustering
  • Grouping data points into clusters such that points within a cluster are highly similar to one another and have little or no similarity to points in other clusters
  • E.g., market segmentation – group together people with similar traits or preferences
  • Association
  • Finding relationships between variables in a dataset
  • E.g., basket analysis – people who buy item A (bread) tend to also buy item B (butter)
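Basket analysis is usually quantified with two measures: support (how often items appear together) and confidence (how often B is bought given that A is bought). A minimal sketch with made-up shopping baskets:

```python
# Hypothetical shopping baskets, one set of items per transaction
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count(itemset):
    """Number of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets)


def support(itemset):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


print(support({"bread", "butter"}))        # 0.6  (3 of 5 baskets)
print(confidence({"bread"}, {"butter"}))   # 0.75 (3 of the 4 bread baskets)
```

A rule such as "bread → butter" is typically considered interesting only when both its support and its confidence exceed thresholds chosen by the analyst.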
What is Cluster Analysis?
  • Cluster analysis is a statistical method for organizing subjects into groups (clusters) based on how closely associated the subjects are
  • Two common types of clustering algorithms:
  • K-means clustering
  • Hierarchical clustering
  • You do not know how many clusters exist in the data before running the analysis
  • K-Means Clustering
  • Hierarchical Clustering
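K-means, the first of the two algorithms listed above, alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below uses made-up two-dimensional points and fixed starting centroids so that it is deterministic; in practice the starting centroids are usually chosen at random, and the number of clusters k is an input that the analyst often has to try several values for.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers described by two traits, e.g., (visits per month, items per order)
customers = [(1, 2), (2, 1), (1, 1), (8, 9), (9, 8), (9, 9)]
centroids, clusters = k_means(customers, centroids=[(1, 2), (8, 9)])

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```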
Three Broad Types of Learning (3)
  • Reinforcement learning
  • The system receives an output variable to be maximized and a series of decisions that can be taken to impact the output
  • Examples:
  • Maximizing the score obtained in the Pac-Man game, by telling the system that Pac-Man can move up, down, left and right
  • Microsoft uses reinforcement learning to select headlines on MSN.com to maximize the number of clicks on a given link
Example: Using MTurk for Machine Learning Tasks
  • MTurk can help Data Scientists and AI Experts to gather large amounts of high-quality training data for computer vision models, which aim to locate objects in images
  • Task: Drawing a bounding box to locate the object
Neural Network
  • Neural network algorithms are one type of machine learning.
  • It is a network of processing elements (i.e., artificial “neurons”) that work in parallel to complete a task, attempt to approximate the functioning of the human brain, and can learn by example.
  • Typically, a neural network is trained by having it categorize a large database of past information (e.g., a database of handwritten digits) for common patterns, so as to infer rules.
How a Neural Network Works
  • A neural network uses rules it learns from patterns in data to construct a hidden layer of logic. The hidden layer then processes inputs, classifying them based on the experience of the model. In this example, the neural network has been trained to distinguish between valid and fraudulent credit card purchases.
Deep Learning Neural Network
  • Today's computers can handle “deep-learning” networks with dozens of layers.
Why Do AI Researchers Often Study Game Playing?
  • DeepMind achieves human-level performance on many Atari games (2015)
  • Watson defeats Jeopardy champions (2011)
  • AlphaGo defeats Go champion (2016)
  • CMU’s Libratus defeats top human poker players (2017)
Some Applications of AI in Business
  • Image related AI
  • Visual search: Find items similar to one selected by a user
  • Object detection: Find and track objects in an image
  • Text related AI
  • Data extraction: Natural language processing (NLP) extracts data from unstructured text, e.g., “My cat Peter ran into the supermarket” (recognize that Peter is a name and supermarket is a location)
  • Text classification: A sentence like "I really loved that bad movie" would be reasonably classified as positive
  • Document summarization: extract the most relevant sentence from a longer passage
Example: ASOS 'Style Match' Visual Recognition Tool
  • Deep Learning Algorithm for Object Detection
  • Example: ASOS ‘Style Match'
Example: Intelligent Personal Assistants
  • Computer search engine using:
  • Natural language
  • Conversational interface, verbal commands
  • Situational awareness
  • Can handle requests for appointments, flights, routes, event scheduling, and more
  • Examples:
  • Apple’s Siri
  • Google Assistant
  • Amazon Alexa
Example: AI Shapes Every Aspect of Amazon’s Business
  • Examples of AI applications:
  • Product recommendations
  • Alexa voice assistant
  • Amazon Go stores
  • Robots in warehouses
  • Inventory placement in warehouses
  • Source: https://www.fastcompany.com/90246028/how-ai-is-helping-amazon-become-a-trillion-dollar-company
Discussion: Amazon Go
  • Watch the in-class video:
  • How does the checkout process differ between Amazon Go and typical supermarkets?
  • What do you like or dislike?
  • What are the benefits to customers? To Amazon?
  • Is there a tradeoff between convenience and privacy here?
Example: AI-Powered Features in Google Sheets
  • The “Explore” feature uses NLP to turn questions about data into formulas that extract data
  • Allows you to ask a question about your data in everyday language, e.g., “What is the sum of revenue by salesperson?” or “How much revenue does each product category generate?”
Example: Handwriting Recognition (cont’d.)
  • Input Layer: 784 (28×28) neurons, one per pixel of the input image
  • Hidden Layer: N hidden neurons (N depending on the algorithm)
  • Output Layer: 10 neurons, one for each digit (0-9)
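The layer sizes above determine the shape of the computation: 784 inputs feed N hidden neurons, which feed 10 outputs. The sketch below runs one forward pass through such a network with random, untrained weights, so the predicted digit is meaningless; it only shows how the numbers flow. The choice of N = 16, the ReLU activation, and the softmax output are assumptions, not details taken from the lecture.

```python
import math
import random

random.seed(0)
N_INPUT, N_HIDDEN, N_OUTPUT = 784, 16, 10  # N_HIDDEN = 16 is an arbitrary choice


def random_layer(n_in, n_out):
    """Untrained layer: n_out rows of n_in small random weights, plus zero biases."""
    weights = [[random.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer):
    """Weighted sum of the inputs plus a bias, computed for every neuron in the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


hidden_layer = random_layer(N_INPUT, N_HIDDEN)
output_layer = random_layer(N_HIDDEN, N_OUTPUT)

# Stand-in for a flattened 28x28 grayscale image with pixel values in [0, 1)
image = [random.random() for _ in range(N_INPUT)]

hidden = [max(0.0, h) for h in dense(image, hidden_layer)]  # ReLU activation
probabilities = softmax(dense(hidden, output_layer))
predicted_digit = probabilities.index(max(probabilities))

print(len(image), len(hidden), len(probabilities))  # 784 16 10
```

Training would adjust the weights so that, for each labeled image, the probability assigned to the correct digit increases.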
AI Applications in Banking
  • Employee surveillance (AI machines, e.g., IBM Watson)
  • Tax preparation/filing (e.g., H&R Block uses IBM Watson)
  • Automated customer service; answering customer inquiries in real-time (chatbots; e.g., Olivia at HSBC)
  • Automated online support for paying bills and account inquiries using Amazon Alexa (e.g., Capital One)
  • Fraud detection and anti-money-laundering activities (e.g., Bank Danamon)
Lessons from AI Research
  • Clearly-defined tasks that require intelligence and education from humans tend to be doable for AI techniques
  • Playing chess, drawing logical inferences from clearly-stated facts, performing probability calculations in well-defined environments, etc.
  • Complex, messy, ambiguous tasks that come naturally to humans are much harder
  • Remarkable progress in recent years, especially in machine learning for narrow domains
  • Example: image recognition, speech recognition, reinforcement learning in computer games, self-driving cars
  • AI systems still lack
  • Broad understanding of the world, common sense, ability to learn from very few examples, truly out-of-the-box creativity
Predicting Future of AI
  • Current State of AI (2025)
  • Advanced Narrow AI
  • Highly specialized AI systems excelling in specific domains
  • Examples include sophisticated language models, multimodal AI, and agentic AI
  • Emerging General AI Capabilities
  • AI systems demonstrating advanced reasoning and problem-solving abilities across multiple domains
  • Examples include OpenAI’s o1 model and xAI’s Grok 3
  • Near Future (2026-2030)
  • Enhanced General AI
  • AI systems approaching human-level intelligence in most domains
  • Improved contextual understanding and cognitive abilities
Predicting Future of AI (cont’d)
  • Distant Future (2030+)
  • Super AI
  • AI surpassing human intelligence in many areas
  • Potential to significantly improve quality of life
  • Emphasis on responsible development and ethical considerations
  • Artificial Consciousness
  • Remains a theoretical concept
  • No clear timeline for development
Potential Dangers of AI: Dystopia
  • Position of AI Dystopia
  • Elon Musk: “We need to be super careful with AI. Potentially more dangerous than nukes.”
  • Bill Gates: “I am in the camp that is concerned about super intelligence. Musk and some others are on this and I don’t understand why some people are not concerned.”
  • Stephen Hawking: “The development of full artificial intelligence could spell the end of the human race.”
  • What do you think?
Potential Dangers of AI: Utopia
  • Position of AI Utopia
  • AI will partner and support humans to innovate, e.g.,
  • Crime fighting
  • Prediction of the probability that a song will be a hit
  • Finding the perfect match for dating in a population of 30,000
  • Some issues related to utopia
  • People will have a problem of what to do with their free time
  • The road to AI Utopia could be rocky (impact on jobs)
  • Everything will be different – one day we will not drive anymore and there may not be human financial advisors
Decision Tree Classification
  • Decision Tree is a supervised learning method used for classification
  • It is a structure that can be used to divide a large collection of records into successively smaller sets of records by applying a sequence of simple decision rules
  • It requires little data preparation – it is able to handle both continuous (numerical) and categorical data
Some Limitations of Decision Tree
  • Problem of overfitting – an over-complex tree performs poorly on unseen data because it tends to capture the unique features of the training data
  • Mitigation: Limit the size/depth of the tree
  • Decision trees can be unstable due to small variations in data
  • Mitigation: Train multiple trees with randomly drawn samples
  • May result in biased trees if some classes (outcomes) dominate (e.g., one of the classes is observed only in 2% of all the cases)
  • Mitigation: Balance the dataset prior to fitting with a decision tree
Tools Available on the Market
  • Excel add-ins
  • XLSTAT, TreePlan, PrecisionTree, SolutionTree
  • Commercial software
  • SAS, SPSS, STATA, TreeAge
  • Open-source software
  • Weka, R, Python
The Tool We Will Use
  • An Excel spreadsheet with VBA macros that implement the C4.5 algorithm, one of the popular decision tree algorithms
  • The original spreadsheet developed by Dr. Angshuman Saha (PhD in Statistics, U of Washington)
  • It provides sufficient capability for learning and small projects – supporting up to 10,000 rows of data, 50 predictors, and categorical variables (including class variable) of up to 20 categories
  • Note: There should be no missing value in the class variable (preferably all variables)
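C4.5 chooses which predictor to split on using the gain ratio: the reduction in class entropy achieved by the split, divided by the entropy of the split itself. The sketch below shows that calculation for categorical predictors on a made-up four-row dataset. It illustrates the criterion only; it is not the spreadsheet's actual code, and it leaves out C4.5's handling of continuous variables and pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, normalized by the split's own entropy."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the class labels after the split
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical records in the style of the bank risk demonstration
rows = [
    {"Debt": "High", "Income": "Low", "Risk": "High"},
    {"Debt": "High", "Income": "High", "Risk": "High"},
    {"Debt": "Low", "Income": "Low", "Risk": "Low"},
    {"Debt": "Low", "Income": "High", "Risk": "Low"},
]

for attribute in ("Debt", "Income"):
    print(attribute, gain_ratio(rows, attribute, "Risk"))
```

In this toy data Debt separates the two risk classes perfectly while Income tells us nothing, so a tree built on it would split on Debt first.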
Decision Tree Demonstration
  • A bank is now trying to classify its customers into different risk types, i.e., Low, Medium, and High
  • Generate a decision tree based on the following data:
Procedures
  • Open the Excel file “Lecture 5 - DecisionTree.xlsm” and enable macros
  • Copy the dataset in the worksheet “demo_data” to the worksheet “Data”
  • Check the settings of data types. Risk is the outcome (i.e., a class variable), whereas Debt, MaritalStatus, and Income are the inputs (i.e., categorical predictors)
  • Go to the worksheet “UserInput”
  • Look at some key settings (e.g., Leaf Node Criteria; Training/Test Set)
  • Finally, click “BuildTree”
Key Results to Look at
  • The decision tree is displayed on the worksheet “Tree”.
  • Support is the % of data falling into a particular leaf node.
  • Confidence (conf) is the accuracy of the prediction for that node.
  • You may make predictions by changing the values of the predictors on the left.
  • The composition of each node is shown on the worksheet “NodeView”.
  • The summary of results is shown on the worksheet “Result”. You may check the accuracy of the model by referring to the % of misclassified data points and the confusion matrix.
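Support, confidence, and the misclassification percentage are all simple ratios of record counts. The sketch below computes them from a hypothetical leaf node and a hypothetical confusion matrix; the numbers are invented and are not the output of the demonstration spreadsheet.

```python
# Hypothetical leaf node: class counts of the records that fall into it
leaf_counts = {"Low": 18, "Medium": 2, "High": 0}
total_records = 100

records_in_leaf = sum(leaf_counts.values())
predicted_class = max(leaf_counts, key=leaf_counts.get)       # majority class in the leaf
support = records_in_leaf / total_records                     # share of all data in this leaf
confidence = leaf_counts[predicted_class] / records_in_leaf   # accuracy of the leaf's prediction

print(predicted_class, support, confidence)  # Low 0.2 0.9

# Hypothetical confusion matrix, indexed as confusion[actual][predicted]
confusion = {
    "Low":    {"Low": 40, "Medium": 3,  "High": 0},
    "Medium": {"Low": 4,  "Medium": 25, "High": 3},
    "High":   {"Low": 0,  "Medium": 2,  "High": 23},
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[c][c] for c in confusion)             # diagonal entries
misclassified_pct = 100 * (total - correct) / total

print(f"{misclassified_pct:.1f}% misclassified")  # 12.0% misclassified
```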
Question
  • Suppose you now have a new customer with the following attributes:
  • Debt = Low
  • Income = High
  • Marital Status = Married
  • Based on the generated rules from the decision tree, how would you classify the customer’s risk type?
Exercise 1: Targeting Personal Loan Customers
  • A bank is trying to expand its customer base and bring in more loan business
  • By converting its liability customers (depositors) to Personal Loan customers
  • The dataset is from last year’s campaign
  • 500 customers; 50 (10%) accepted the personal loan
Data Description
  • Class Variable
Objectives
  • Can we model the previous campaign's customer behavior to analyze which combination of parameters makes a customer more likely to accept a personal loan?
  • There are several products/services the bank offers, such as mortgage, certificates of deposit (CD), securities accounts, e-banking, credit cards, etc. Can we spot any association among these for finding cross-selling opportunities?
Example: A Simple Decision Tree Based on Only Three Variables
  • [Figure: decision tree with three yes/no decision nodes (“Have credit card?”, “Have CD Account?”, “Use online banking?”) and four leaves (one “Offer the loan”, three “Do not offer the loan”)]
  • With a simple model like this, many potential customers will be missed.
Procedures (1)
  • Copy the dataset in the worksheet “bank_data” to the worksheet “Data”
  • Select the variables to be included (note: including more variables increases processing time)
  • Model 1: CD Account, Online, CreditCard
  • Model 2 (best): CCAvg, Mortgage, Income, Education
  • [Exclude CCAvg if using French OS]
  • Enable the following setting:
  • “Minimum Node Size” and “Maximum Depth” to keep the tree small
  • Construct some alternative models and compare their accuracy rates
Obtaining The Best Model by Including All Variables
  • The model is able to predict 40 out of 50 true cases of class “1”, i.e., those who accepted the loan
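To put the 40-out-of-50 figure in context, the sketch below computes the recall it implies and compares it with a naive baseline that never offers the loan. The counts come from the exercise (500 customers, 50 acceptors, 40 acceptors predicted); the function names are illustrative only, and it is assumed that the figure refers to the full 500-customer dataset.

```python
def recall(true_positives: int, actual_positives: int) -> float:
    """Share of actual acceptors (class "1") that the model correctly flags."""
    return true_positives / actual_positives


def baseline_accuracy(total: int, actual_positives: int) -> float:
    """Accuracy of a trivial model that predicts class "0" for every customer."""
    return (total - actual_positives) / total


# Counts taken from the exercise description.
model_recall = recall(true_positives=40, actual_positives=50)
naive_accuracy = baseline_accuracy(total=500, actual_positives=50)

print(f"Recall on class 1: {model_recall:.0%}")
print(f"Accuracy of always predicting 0: {naive_accuracy:.0%}")
```

Because only 10% of customers accepted the loan, overall accuracy can look high even for a model that finds no acceptors, so the number of captured class “1” cases is worth checking alongside the misclassification rate.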
Procedures (2)
  • Enable the following setting:
  • “Partition Data into Training/Test set” to further validate the model
  • The goal is to obtain a model that has good predictive accuracy on both the training data and the test data
  • The model will likely change from run to run because the dataset is partitioned randomly into training and test sets (i.e., a different random subset is used each time you train the model)
  • You should train the model multiple times to get a stable set of rules
  • For example, the following result is considered fairly good for practical use
  • No true case of “1” is missed out
  • All true cases of “1” are captured
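The effect of random partitioning can be sketched in a few lines. This is only an illustration, not the spreadsheet tool's own procedure: the 30% test share, the seeds, and the function name `split_rows` are assumptions made for the example.

```python
import random


def split_rows(rows, test_share=0.3, seed=None):
    """Randomly partition rows into (training, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


customer_ids = list(range(500))  # stand-in for the 500 customers in the dataset

# Two runs with different seeds mimic training the model twice.
train_a, test_a = split_rows(customer_ids, seed=1)
train_b, test_b = split_rows(customer_ids, seed=2)

print(len(train_a), len(test_a))
print("Same test set in both runs:", set(test_a) == set(test_b))
```

Each run trains on a different subset of customers, which is why the resulting rules can differ and why repeating the training helps to identify rules that remain stable.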
Procedures (3)
  • Using the final model, predict which of the following customers are likely to accept the personal loan offer by entering the values of predictors onto the worksheet “Tree” (column H)
  • In usual cases, your model should indicate that the first customer is likely to accept the offer
Exercise 2: Predicting Customer Churn
  • Dataset: Telco Customer Churn
  • The dataset contains 1000 rows (customers) and 20 columns (features)
  • The “Churn” column is our target
  • Goal: Predict churn to retain customers
  • You can analyze all relevant customer data and develop focused customer retention programs
Task – Predicting Churn
  • Copy the dataset in the worksheet “telco_data” to the worksheet “Data”
  • Select the variables to be included (note: including more variables increases the processing time)
  • Construct some alternative models and compare their accuracy rates; e.g., try the following sets of predictors
  • Model 1: Gender, Contract, MonthlyCharges
  • Model 2 (best): Tenure, TechSupport, Contract, TotalCharges
  • Based on the best model, provide some suggestions on the design of customer retention program:
  • Who are most likely to churn?
  • What to do with these customers?
  • Is it effective to rely on the model to target potential churning customers?
Results Based on Model 2 (Partial View)
  • Node 13 captures most cases of churning:
  • 38% of all cases fall into this node, and 53% of them are churning cases
  • The overall model correctly predicts 203 of the 249 churning cases in the whole dataset, with a hit rate of 54% (203/378).
  • Note: You should obtain a similar model even when you partition the data into training/test set.
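The two ratios quoted above measure different things, and a short calculation makes the distinction explicit. The sketch assumes that 378 is the number of customers the model flags as churning (inferred from the 203/378 hit rate); the function names are illustrative.

```python
def recall(true_positives: int, actual_positives: int) -> float:
    """Share of actual churners that the model catches."""
    return true_positives / actual_positives


def precision(true_positives: int, predicted_positives: int) -> float:
    """Share of customers flagged as churners who really churn (the hit rate)."""
    return true_positives / predicted_positives


correctly_flagged = 203  # churners the model predicts correctly
actual_churners = 249    # churning cases in the whole dataset
flagged_as_churn = 378   # assumed: all customers the model predicts will churn

churn_recall = recall(correctly_flagged, actual_churners)
churn_precision = precision(correctly_flagged, flagged_as_churn)

print(f"Recall: {churn_recall:.1%}")
print(f"Precision (hit rate): {churn_precision:.1%}")
```

Under these assumptions the model catches most churners, but nearly half of the customers it flags would not actually churn, which matters when deciding how costly a retention offer can be.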
Suggestions for Customer Retention
  • Now we know customers with the following attributes are likely to churn:
  • Tenure < 70 months
  • TechSupport: no
  • Contract: month-to-month
  • TotalCharges >= 19.05
  • Potential suggestions based on the above attributes:
  • Focus on customers with tenure less than 70 months
  • Convert the month-to-month contract to one-year contract by offering discounts
  • Bundle tech support in the contract
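The churn rule above can be applied to a customer list as a simple filter. This is a minimal sketch: the field names (`tenure_months`, `tech_support`, `contract`, `total_charges`) and the two sample customers are hypothetical, and only the four thresholds come from the model result.

```python
def likely_to_churn(customer: dict) -> bool:
    """Apply the four conditions of the high-churn rule to one customer."""
    return (
        customer["tenure_months"] < 70
        and customer["tech_support"] == "no"
        and customer["contract"] == "month-to-month"
        and customer["total_charges"] >= 19.05
    )


# Hypothetical customers for illustration only.
customers = [
    {"id": "A", "tenure_months": 5, "tech_support": "no",
     "contract": "month-to-month", "total_charges": 350.0},
    {"id": "B", "tenure_months": 60, "tech_support": "yes",
     "contract": "two-year", "total_charges": 4200.0},
]

targets = [c["id"] for c in customers if likely_to_churn(c)]
print(targets)
```

Customers selected this way would be the candidates for the contract discount or bundled tech support suggested above.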
Video Cases for Exercise: VideoPeel and AI Cat Flap
  • VideoPeel
  • An example of using Amazon Transcribe to process consumer review videos.
  • The company uses Amazon Transcribe and other Amazon AI services to extract useful information to optimize the placement of reviews and the product listing.
  • AI Cat Flap
  • An example of using Amazon DeepLens (discontinued in January 2024) to create an AI-controlled cat flap.
  • Machine learning was used to develop an in-house AI application for a programmable video camera to control the opening of a cat flap.
Summary
  • The concept of artificial intelligence (AI)
  • The concept of machine learning
  • Examples of AI and machine learning
  • Future of AI
  • Hands-on exercises on decision tree classification
  • Weekly exercise: MNIST number recognition, video cases of VideoPeel and AI Cat Flap
Lecture 6

Generative AI

Learning Objectives
  • A quick recap: MNIST example, cases of VideoPeel and AI Cat Flap
  • Understand the concept of Generative AI (GenAI)
  • Understand the nature and limitations of large language models (LLMs)
  • Describe unique features of GenAI
  • Critically reflect on the use of GenAI and its limitations in a business context
  • In-class exercise: AI-powered market entry strategy
  • Weekly exercise: video cases of Saks and “AI Tries 20 Jobs”
MNIST Example
  • Compared with the MNIST example, which of the following statements are true if we want to use machine learning to recognize handwritten capital English letters (A to Z)? (choose all applicable options)
  • It will require a larger dataset to train the system to achieve the same accuracy rate.
  • The same neural network for the MNIST example can be readily used for letter recognition.
  • The trained system can recognize lowercase letters (a to z) with similar accuracy.
  • A larger neural network will be needed to recognize English letters.
Case: VideoPeel
  • How might the Amazon AI technologies described in the VideoPeel case help YouTube increase the click rate of ads shown alongside its videos?
  • Amazon’s AI technologies can help YouTube to know:
  • What the videos are about
  • What contents the person is watching
  • What is the good timing to display an ad (right after something is said or when a person is likely to be put into a certain mood by the contents)
Case: AI Cat Flap
  • Crowdsource the job of labelling the images to speed up data preparation
  • Do not reduce the number of conditions, as doing so hurts the prediction accuracy
  • Another option is to use a pre-trained (existing) model to do the classification (e.g., whether it is a cat in an image) by using services on the market (e.g., IBM Watson Visual Recognition)
  • The AI application built for Metric cannot be readily used for other house cats living elsewhere
  • Need to re-train the model with a new set of images
What is Generative AI?
  • Definition
  • AI that creates original content in response to prompts or requests
  • Content can be in various forms: text, images, video, audio, software code
  • Key characteristic
  • GenAI is able to generate new and original content autonomously
  • Other AI systems primarily analyze or classify existing data
Core Technologies
  • Deep learning models
  • Advanced neural networks that can learn complex patterns from large datasets
  • Natural Language Processing (NLP)
  • Used to understand, interpret, and generate human language, crucial for text-based generative AI applications
  • Transformer architecture
  • A specific type of neural network architecture that has revolutionized natural language processing
  • The basis for many powerful generative AI models
How Generative AI works
  • Identifies patterns in large datasets
  • GenAI models analyze vast amounts of data to recognize underlying structures and relationships
  • Encodes relationships and information
  • The models create internal representations of the patterns and information they have learned
  • Uses encoded data to understand and respond to prompts
  • When given a prompt, the AI uses its learned representations to generate appropriate responses
  • Iterative refinement process
  • The models use an iterative process to gradually refine their outputs, starting from random noise and progressively improving the quality of the generated content
Example: DALL-E Prompts
  • Prompt 1: Create a sleek, minimalist logo representing Lunar Strategies, a digital marketing agency
  • Prompt 2: For an educational program for children, design a logo with a book and rocket ship
  • Source: https://mockey.ai/blog/dall-e-prompts/#50-best-dall-e-prompts-guide-with-examples-in-2024
What are Large Language Models (LLMs)?
  • Large Language Models (LLMs) are advanced AI models that understand and generate human language text.
  • They are trained on vast amounts of text data and can be fine-tuned for specific language tasks, such as language translation, text summarization, sentiment analysis, and more.
  • LLMs are specific types of models used within the field of Natural Language Processing (NLP).
Example
  • You understand the difference between these two sentences:
  • A bottle is on the table.
  • A table is on the bottle.
  • But, you may not understand the difference between these two (in Persian):
  • بطری روی میز است.
  • میز روی بطری است.
  • Because you have not learned this language.
How does NLP learn a language? (1 of 2)
  • Pre-training:
  • The model is exposed to a massive amount of text data from the Internet.
  • The model learns to predict the next word in a sentence based on the preceding words. This prediction task encourages the model to capture the statistical patterns, grammar, vocabulary, and context of the language.
  • The model becomes proficient at understanding the relationships between words, the meaning of words in context, and even some level of world knowledge from the text data it has seen.
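The next-word prediction task can be illustrated with a very small counting model. This is a toy sketch under strong simplifications: real LLMs use neural networks over tokens and much longer contexts, whereas this example only counts which word follows which in a three-sentence corpus invented for the illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Tiny invented corpus; pre-training uses vastly more text.
corpus = [
    "a bottle is on the table",
    "a cup is on the table",
    "a book is on the shelf",
]

model = train_bigram_model(corpus)
print(predict_next(model, "the"))
print(predict_next(model, "is"))
```

Even this crude model picks up a statistical pattern (in this corpus, “the” is most often followed by “table”), which hints at how prediction on massive text can capture grammar and context.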
How does NLP learn a language? (2 of 2)
  • Fine-tuning:
  • After pre-training, the model is fine-tuned for specific NLP tasks. This fine-tuning involves training the model on a narrower dataset that is carefully curated for the task at hand.
  • For example, if the task is sentiment analysis, the fine-tuning dataset will consist of labeled sentiment data.
  • Fine-tuning allows the model to generalize its pre-learned language understanding to perform effectively on real-world applications.
What is GPT?
  • Generative Pre-trained Transformer (GPT) is a popular family of LLMs developed by OpenAI. They are known for generating coherent text and use the Transformer architecture.
  • Imagine you have a friend who has read lots of books. You give your friend a partial sentence like "Once upon a _____," and your friend can guess the missing word, like "time" or "unicorn."
  • GPT does something similar but with much more knowledge and text data.
Writing Effective Prompts
  • Clear prompts help ChatGPT understand user intentions accurately, reducing the chance of misinterpretation.
  • Tips for structuring prompts:
  • Be Specific: Clearly state what you want in your prompt.
  • Use Context: Provide context if necessary, especially in multi-turn conversations.
  • Ask Open-Ended Questions: Frame questions in a way that encourages detailed responses rather than "yes" or "no" answers.
  • Experiment and Iterate: Try different phrasings and structures.
  • Avoid Redundancy: Avoid restating the same information in your prompts, as it can confuse the model.
Good vs. Bad Prompts
  • Good prompt:
  • "Please provide step-by-step instructions for creating a pivot table in Excel to analyze sales data by region and product category."
  • This prompt is clear, specific, and outlines the task, making it likely to generate a detailed and helpful response.
  • Bad prompt:
  • "Create a pivot table."
  • This prompt is vague and lacks specificity. It does not clearly convey what assistance is needed, making it less likely to produce a useful response.
Managing ChatGPT outputs (1 of 2)
  • Control response length and tone
  • “Please provide a concise yet academic response.”
  • “Please ensure the explanation is suitable for children.”
  • “Please summarize your answer within two paragraphs.”
  • Be aware of limitations
  • ChatGPT is a machine learning model and may not always provide accurate or complete information.
  • Verify information
  • Independently verify critical information, especially in professional or academic contexts. It may introduce you to some non-existing research papers.
  • Use fact-checking sites (Snopes, PolitiFact, FactCheck.org)
Managing ChatGPT outputs (2 of 2)
  • Avoid plagiarism
  • If using ChatGPT-generated content in reports or documents, provide proper attribution or citations.
  • Respect privacy
  • Do not share sensitive or personal information with ChatGPT.
  • Prevent bias
  • Be aware of potential biases in ChatGPT's responses. If the model produces biased or inappropriate information, consider adjusting your prompts.
SIFT Method for Evaluating GenAI Output
  • Stop
  • Before you share, read, or act on information, pause. Ask yourself: Do I know this source?
  • Stopping helps you avoid being swept up in emotional or misleading content.
  • Investigate
  • Look into who created the content.
  • Check their authority, expertise, and possible bias.
  • Find
  • See if other reliable sources are covering the same claim.
  • If multiple credible outlets confirm it, that strengthens its validity.
  • If not, that’s a red flag.
  • Trace
  • Follow links or reverse image search to get back to the original source.
  • Often, misinformation comes from things being taken out of context, misquoted, or manipulated.
General Use Cases for Study
  • Research assistance:
  • ChatGPT can help with literature reviews by summarizing the research papers you provide to it, with brainstorming by suggesting research questions, and with data analysis by interpreting complex data and explaining statistical results, trends, and patterns.
  • Writing support:
  • Use ChatGPT to make your writing more coherent, concise, and clear, as well as to correct conceptual and grammatical mistakes.
  • Language learning:
  • Practicing conversations and translations with ChatGPT.
Affordances of ChatGPT
  • “What ChatGPT allows employees to do in practice, as opposed to the designed features or capabilities of a technology. Affordances describe how users actually use a technology.” (Retkowsky et al., 2024, pp. 512-513)
  • Affordances are not just about what the technology does, but what it allows users to do based on the technology’s features and how it integrates with the user's needs, environment, and skills.
Applications of GenAI in Various Industries
  • E-commerce – Personalized shopping experiences, content creation
  • Marketing – Campaign optimization, content generation
  • Customer Service – AI-powered chatbots and virtual assistants
  • Creative Industries – Art, music, and design generation
Example: Salesforce Agentforce
  • Suite of customizable AI agents for business functions
  • Key Features:
  • Autonomous operation within defined guardrails
  • Real-time data adaptation
  • Seamless integration with human employees
  • Low-code tools for easy customization
  • Examples of Applications:
  • Customer Service: 24/7 support, case routing, query resolution
  • Sales: Lead nurturing, meeting scheduling, sales coaching
  • Marketing: Campaign optimization, content creation
  • Commerce: Personalized product recommendations, order management
  • Salesforce is building on 10 years of AI innovation
Discussion Questions
  • What potential challenges might arise from using AI to analyze customer sentiment and suggest appropriate responses?
  • What strategies could competitors develop to differentiate themselves in a market where AI-driven customer service becomes the norm?
Objective
  • Use generative AI to develop a market entry strategy for a fictional company entering a new international market.
Steps
  • Step 1: Company & market selection.
  • Each group is assigned a fictional business.
  • Select a new international market for expansion (e.g., Japan, Brazil, Germany).
  • Step 2: AI-assisted research
  • Use generative AI tools (ChatGPT, Claude, Gemini, etc.) to gather insights on:
  • Market Trends: What are key consumer preferences in this market?
  • Competitive Landscape: Who are the main competitors?
  • Challenges & Opportunities: What are the barriers to entry and potential advantages?
  • Step 3: AI-generated marketing strategy
  • Prompt AI to generate a marketing image tailored to the new market.
  • Refine the AI’s response by asking follow-up questions or requesting different angles.
  • Be prepared to share the final image!
List of Businesses
  • "LumeAlgae": A biotechnology startup that creates bioluminescent home lighting using engineered living algae. Instead of electricity, these "living lamps" use nutrient-rich glass vessels that glow with a soft, natural neon-blue or green light.
  • "StratoStay": A high-altitude eco-tourism company that offers modular, floating hotel pods suspended by carbon-fiber balloons. These pods "drift" over scenic landscapes like the Grand Canyon or the Swiss Alps, providing 360-degree views.
  • "MorphoWeave": A high-performance "smart fashion" brand. Their garments are made from a responsive polymer that physically changes its texture and shape based on the weather—becoming waterproof and ridged during rain, or porous and mesh-like in the heat.
Reflection Questions
  • What are the strengths and weaknesses of AI-generated content?
  • What limitations does AI have?
  • How does human input improve the AI’s results?
Video Cases for Weekly Exercise: Saks and “AI Tries 20 Jobs”
  • Saks
  • Saks Fifth Avenue is an American luxury department store chain.
  • The company uses Salesforce’s Agentforce to revolutionize customer service and elevate the shopping experience.
  • “AI Tries 20 Jobs”
  • The video explores whether AI can perform various professional jobs by testing AI tools across multiple fields, including software engineering, medicine, graphic design, comedy, journalism, and more.
Summary
  • The concept of Generative AI (GenAI)
  • The nature and limitations of LLMs
  • Reflection on the use of generative AI and its limitations
  • In-class exercise: AI-powered market entry strategy
  • Weekly exercise: video cases of Saks and “AI Tries 20 Jobs”
Lecture 7

Cybersecurity

Learning Objectives
  • A quick recap: video cases of Saks and “AI Tries 20 Jobs”
  • Explain the challenges of cybersecurity
  • Explain why information systems are vulnerable
  • Describe key types of computer crimes
  • Describe various controls for securing information resources
  • Describe organizational policies and procedures for information security
  • Exercises: Cases of dark web, deep fakes, and ethical hacking
Case: Saks
  • How can luxury brands like Saks Fifth Avenue maintain a balance between high-tech solutions and the personal touch that customers expect?
  • Some considerations:
  • Augmentation vs. replacement
  • Routine vs. non-routine tasks
  • Omnichannel experience
Case: “AI Tries 20 Jobs”
  • Among the professions discussed in the video, which one do you think is most at risk of being replaced by AI, and why?
Most Common Passwords
  • Password:
  • 123456
  • admin
  • 12345678
  • 123456789
  • 1234
  • 12345
  • password
  • Aa123456
  • 1234567890
  • Time to crack:
  • Less than one second for all of those listed here!
  • Source: https://www.rd.com/article/passwords-hackers-guess-first/
  • Are these passwords better?
  • password123
  • aa12345678
  • @pple
  • !2345
How are Passwords Stored?
  • Passwords are usually stored in a hashed format (a one-way transformation, often loosely called “encrypted”) in companies’ databases
  • “GoodMorning” → deaead14fbeb3c7d4bf835032bb63543
  • This conversion is one-way: the original password cannot be computed back from the hash
  • The simplest way to crack a hash is to guess the password and generate the hash, then compare it to the actual hashed value
  • Leaked password hashes are shared on the file-sharing websites and dark web
  • Hackers get these password hashes and extract real passwords to compromise online accounts (e.g., Facebook, online banking, etc.)
  • One note on password encryption and storage:
  • For some applications (e.g., saved passwords in Chrome/Google Cloud), the encryption of passwords is made reversible by design for user convenience (e.g., to allow users to view saved passwords)
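The guess-hash-compare approach described above can be sketched with Python's standard `hashlib` module. SHA-256 is used here purely for illustration (the slide does not state which hash function produced its example digest), and the “leaked” hash is generated inside the example rather than taken from any real breach.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of a password as a hexadecimal string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash: str, candidates):
    """Hash each candidate and return the one whose hash matches, if any."""
    for guess in candidates:
        if sha256_hex(guess) == target_hash:
            return guess
    return None


# Stand-in for a hash obtained from a leaked database (assumption for the demo).
leaked_hash = sha256_hex("password")

# A few entries from the most-common-passwords list.
common_passwords = ["123456", "admin", "12345678", "password", "Aa123456"]

print(dictionary_attack(leaked_hash, common_passwords))
```

The hash is never reversed; the attacker simply finds an input that produces the same output, which is why common passwords fall almost instantly.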
How are Passwords Cracked?
  • Hashcat is one of the fastest password recovery tools and offers multiple attack modes, e.g.,
  • Brute-force
  • Dictionary
  • Combinator
  • Toggle-Case
  • Cracking speed (by brute-force) can be improved by using high-end graphics cards (i.e., millions-billions of guesses per second)
  • An 8-character password can be cracked in 48 minutes with eight RTX4090 GPUs
  • Still, a 15-character password (with 95^15 combinations) is practically infeasible to crack by brute force: it would take about 6 billion years with the same setup
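The figures above can be checked with a short calculation. The sketch assumes that 48 minutes is the time needed to try every 8-character combination of the 95 printable characters, and that the guessing speed stays the same for longer passwords; both are simplifying assumptions.

```python
ALPHABET_SIZE = 95  # printable characters, as in the 95^15 figure


def combinations(length: int) -> int:
    """Number of possible passwords of the given length."""
    return ALPHABET_SIZE ** length


# Assumed: the 8-GPU setup exhausts all 8-character passwords in 48 minutes.
seconds_for_8_chars = 48 * 60
guesses_per_second = combinations(8) / seconds_for_8_chars

seconds_for_15_chars = combinations(15) / guesses_per_second
years_for_15_chars = seconds_for_15_chars / (365.25 * 24 * 3600)

print(f"Implied speed: {guesses_per_second:.2e} guesses per second")
print(f"15 characters: {years_for_15_chars:.2e} years")
```

Each extra character multiplies the search space by 95, so seven more characters raise the cracking time by a factor of 95^7, which is what turns minutes into billions of years.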
Uber Data Breach: MFA Fatigue Attack
  • The hacker purchased an employee’s stolen credentials from the dark web
  • The hacker attempted to log in but was stopped by multi-factor authentication (MFA)
  • The hacker bombarded the employee with numerous MFA requests for an hour
  • The hacker then pretended to be from Uber’s IT team and asked the employee to accept the push notification
  • Finally, the hacker gained access to Uber’s intranet and sensitive information
  • Source: https://blogs.manageengine.com/desktop-mobile/desktopcentral/2022/09/21/uber-data-breach-2022-how-the-hacker-annoyed-his-way-into-the-network-and-our-learnings.html
E-Business Security Environment
  • Scope of the problem
  • Overall size of and losses due to cybercrime unclear
  • Global cybercrime costs are expected to reach $15.6 trillion by 2029
  • Reports by security product providers indicate increasing cybercrimes
  • Online credit card fraud is one of the most high-profile forms
  • Underground economy marketplaces sell stolen information, malware and more
  • Source: https://www.statista.com/forecasts/1280009/cost-cybercrime-worldwide
Dark Web
  • A part of the Internet that is not indexed by search engines
  • Requires special Web browsers, e.g., TOR, to access
  • Activities are anonymous
  • Encrypted traffic goes through multiple TOR-relays (servers) to hide IP address, data, and browsing history
  • Giving rise to illegal activities, e.g., selling credit card numbers, drugs, guns, hacked Netflix accounts, etc.
  • Not everything is illegal, e.g., “Black Book” (the Facebook of TOR)
On the dark web, people can offer or buy a range of services related to cybercrime, e.g., identifying valuable targets for attack, providing fake information to mislead targets, or providing money-laundering networks to clean illegal money.
  • Any person or business can purchase cyberattack services to target systems that hold something of value to steal or disrupt.
What is Good Cybersecurity?
  • How to achieve highest degree of security
  • New technologies
  • Organizational policies and procedures
  • Industry standards and government laws
  • Other factors
  • Time value of money (protect data for a few hours or days?)
  • Cost of security versus potential loss
  • Security often breaks at weakest link
System Vulnerability (1)
  • Why systems are vulnerable
  • Accessibility of networks
  • Hardware problems (breakdowns, configuration errors, damage from improper use or crime)
  • Software problems (programming errors, installation errors, unauthorized changes)
  • Disasters
  • Use of networks/computers outside of firm’s control
  • Loss and theft of portable devices
The architecture of a Web-based application typically includes a Web client, a server, and corporate information systems linked to databases. Each of these components presents security challenges and vulnerabilities. Floods, fires, power failures, and other electrical problems can cause disruptions at any point in the network.
  • SECURITY CHALLENGES AND VULNERABILITIES
System Vulnerability (2)
  • Internet vulnerabilities
  • Network open to anyone
  • Size of Internet means abuses can have wide impact
  • Use of fixed Internet addresses with cable/DSL modems creates fixed targets for hackers
  • Unencrypted VOIP
  • E-mail, Peer-to-Peer Sharing, Instant Messaging
  • Interception
  • Attachments with malicious software
  • Transmitting trade secrets
System Vulnerability (3)
  • Software vulnerabilities
  • Commercial software contains flaws that create security vulnerabilities
  • Hidden bugs (program code defects)
  • Zero defects cannot be achieved because complete testing is not possible with large programs
  • Flaws can open security holes to intruders
  • Patches
  • Small pieces of software to repair flaws
  • Exploits often created faster than patches can be released and implemented
Example: MacOS High Sierra Root Login Bug
  • The bug makes it possible for anyone to log into a locked Apple device using just the username “root” (without a password).
  • This means anybody with physical access to your MacOS High Sierra device can log into your computer, no matter how secure your passwords are.
  • Source: https://www.cnet.com/news/apple-flaw-allows-macos-high-sierra-logins-without-passwords/
System Vulnerability (4)
  • Wireless security challenges
  • Radio frequency bands easy to scan
  • SSIDs (service set identifiers)
  • Identify access points
  • Broadcast multiple times
  • Can be identified by sniffer programs
  • Wardriving
  • The act of searching for Wi-Fi wireless networks by a person in a moving vehicle, using a portable device (e.g., laptop, smartphone)
  • Eavesdroppers drive by buildings and try to detect SSID and gain access to network and resources
  • Once access point is breached, intruder can access networked drives and files
Many Wi-Fi networks can be penetrated easily by intruders using sniffer programs to obtain an address to access the resources of a network without authorization.
  • WI-FI SECURITY CHALLENGES
  • Sniffer programs
  • Monitor information traveling over network
  • Enable hackers to steal proprietary information such as login, e-mail, company files, etc.
Zero-day Vulnerability
  • Zero-day vulnerability
  • It is a previously undisclosed computer software vulnerability that hackers can exploit
  • Once the flaw becomes known, the software's author has zero days in which to plan and advise any mitigation against its exploitation
  • Price range for zero-day exploits: $60,000 (Adobe Reader) up to $2,500,000 (Apple iOS)
  • Project Zero (Google)
  • Bugs found by the Project Zero team are reported to the manufacturer and only made publicly visible:
  • once a patch has been released;
  • or if 90 days have passed without a patch being released
  • The 90-day-deadline is Google's way of implementing responsible disclosure
  • Some flexibility is allowed to balance rapid patch development with responsible disclosure (e.g., 7-day disclosure policy for actively exploited vulnerabilities)
  • Recent development: Use of AI agents for identifying security vulnerabilities in code
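The disclosure timeline described above can be expressed as a small date calculation. This sketch follows only the simplified rules on this slide (90 days by default, 7 days for actively exploited bugs, earlier if a patch ships first); the actual policy has further details not modelled here, and the dates and function name are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

STANDARD_DEADLINE_DAYS = 90
ACTIVELY_EXPLOITED_DEADLINE_DAYS = 7


def disclosure_date(
    reported_on: date,
    patched_on: Optional[date] = None,
    actively_exploited: bool = False,
) -> date:
    """Earliest date on which the bug details would become public."""
    days = (
        ACTIVELY_EXPLOITED_DEADLINE_DAYS
        if actively_exploited
        else STANDARD_DEADLINE_DAYS
    )
    deadline = reported_on + timedelta(days=days)
    # A patch released before the deadline triggers disclosure earlier.
    if patched_on is not None and patched_on < deadline:
        return patched_on
    return deadline


reported = date(2024, 1, 1)  # hypothetical report date
print(disclosure_date(reported))
print(disclosure_date(reported, patched_on=date(2024, 2, 1)))
print(disclosure_date(reported, actively_exploited=True))
```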
Example 1: WebKit Zero-day on iPhones and Macs
  • On Feb 13, 2023, Apple released a security update to address a zero-day vulnerability used in attacks to hack iPhones, iPads, and Macs
  • It is a WebKit type confusion issue that could be exploited to trigger OS crashes and gain code execution on compromised devices
  • Attackers can execute arbitrary code on devices running vulnerable OS after opening a malicious web page
  • Source: https://www.bleepingcomputer.com/news/security/apple-fixes-new-webkit-zero-day-exploited-to-hack-iphones-macs/
Example 2: Follina Exploit in MS Office
  • Exploitation process:
  • Attacker crafts malicious Microsoft Office document
  • Document uses Word’s remote template feature to retrieve HTML file
  • HTML file executes PowerShell code through MSDT
  • Arbitrary code execution occurs with calling application’s privileges
  • Characteristics:
  • No macros required
  • Works even in Protected View
  • Exploitable via .doc and .rtf files
  • Consequences:
  • Allows attackers to install programs
  • View, modify, or delete data
  • Create new user accounts
  • Achieve persistent access to victim systems
How Follina Works
  • Source: https://youtu.be/NQKLWhvRQDE?si=xQhK38DTOs_GG7vz
What is Computer Crime?
  • “Using a computer to commit an illegal act”
  • Targeting a computer while committing an offense
  • Unauthorized access of a server to destroy data
  • Using a computer to commit an offense
  • Using a computer to embezzle funds
  • Using computers to support criminal activity
  • Maintaining books for illegal gambling on a computer
Types of Computer Crimes
  • Unauthorized Access
  • Stealing information
  • Stealing use of computer resources
  • Accessing systems with the intent to commit information modification
  • Information Modification
  • Changing data for financial gain (e.g., embezzlement)
  • Defacing a Web site (e.g., hacktivists making a statement)
  • An information modification attack
Who Commits Computer Crimes?
  • Computer criminals come in all shapes and sizes. In order of number of infractions, they are:
  • Current or former employees; most organizations report insider abuses as their most common crime
  • People with technical knowledge who commit business or information sabotage for personal gain
  • Career criminals who use computers to assist in crimes
  • Outside crackers—commit millions of intrusions per year
Insider Threats
  • Unauthorized access can occur in many ways
  • Some are based on insider threats
  • Disgruntled employees, former employees, contractors
  • Example: Edward Snowden
  • He was an American computer professional who once worked for the CIA and was a contract worker for the National Security Agency (NSA).
  • He disclosed thousands of classified documents to several media outlets.
How Do They Do It?
  • Technology-based approaches
  • Vulnerability scanners
  • Packet sniffers
  • Keyloggers
  • Brute force
  • Exploiting human weaknesses
  • Phishing
  • Social engineering (misrepresenting oneself to trick others into revealing information)
  • Shoulder surfing
  • Dumpster diving
Computer Viruses and Malware
  • Viruses
  • Rogue software program that attaches itself to other software programs or data files in order to be executed
  • Can damage your hardware, software or files
  • Trojan horses
  • Appear to be useful software but will actually do damage
  • Open a backdoor on your computer that gives malicious users access to your system
  • Spyware
  • Software that monitors the activity on a computer, such as the Web sites visited or even the keystrokes of the user
  • Other types: reset browser home page; redirect search requests; slow computer performance by taking up memory
  • They are spread by:
  • Downloads on Web sites and social networks
  • Drive-by (unintentional) downloads, e.g., clicking on a link emailed to you
  • E-mail, Instant Messaging attachments
Example 1: Denial-of-Service Attack
  • Denial-of-service attack (DoS)
  • Flooding a server with thousands of false requests to crash the network
  • Distributed denial-of-service attack (DDoS)
  • Use of numerous computers to launch a DoS
  • Botnets
  • Networks of “zombie” PCs infiltrated by bot malware
  • Deliver 90% of world spam, 80% of world malware
  • Necurs: the largest spam botnet controlling millions of computers
  • DoS attack can also be done by a single computer
  • Example: BlackNurse attack requires just a laptop and a bandwidth of 20 Mbps to bring down firewalls and servers
Example 2: Phishing Attack
  • Typical steps:
  • A programmer writes a phishing attack template and sells it.
  • A phisher purchases the template and designs the attack.
  • The phisher contracts with a cracker to host the phishing Web site.
  • The phisher contacts a bot herder to send the phishing messages through the botnets.
  • The phisher sends the information attained to a collector.
  • The collector works with a mule herder to withdraw funds from banks.
Example 3: Mobile Ransomware
  • Fusob is one of the popular mobile Trojans
  • Users in more than 100 countries worldwide were attacked by this Trojan-Ransom program.
  • This family is mainly spread via porn sites; its members usually appear under the name xxxPlayer and mimic a multimedia player application used for watching porn videos.
  • The criminals usually demand between $100 and $200 to unblock the device. The ransom has to be paid in the form of codes from pre-paid iTunes cards.
Common Controls to Achieve Security
  • Physical access restrictions
  • Firewalls
  • Encryption
  • Virus monitoring and prevention
  • Secure data centers
  • Systems development controls
  • Human controls
Physical Access Restrictions
  • Physical access controls typically focus on authentication
  • Something you have
  • Keys
  • Smart cards
  • Something you know
  • Password
  • PIN code
  • Something you are
  • Biometrics
  • Identification via fingerprints, retinal patterns in the eye, facial features, or other bodily characteristics
Example: Stealing Fingerprints Using Photos
  • New research warns that hackers could copy fingerprints from high-resolution photographs
  • E.g., Jan Krissler, a German hacker, used high resolution photos, including one from a government press office, to successfully recreate the fingerprints of Germany’s defense minister
  • Source: https://interestingengineering.com/fingerprints-can-be-stolen-from-photos-research-suggests
Firewalls
  • Firewall — part of a computer system designed to detect intrusion and prevent unauthorized access to or from a private network
  • Filter traffic
  • Incoming and/or outgoing traffic
  • Filter based on traffic type
  • Filter based on traffic source
  • Filter based on traffic destination
  • Filter based on combinations of parameters
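  • Sketch (illustrative, not from the lecture): the filtering criteria above can be pictured as an ordered rule list in which the first matching rule decides. The `Packet` and `Rule` classes, the addresses, and the "deny by default" choice are assumptions made for this example, not a real firewall's configuration format.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Packet:
    direction: str   # "in" or "out"
    protocol: str    # traffic type, e.g. "tcp" or "udp"
    src: str         # source IP address
    dst: str         # destination IP address
    dst_port: int


@dataclass
class Rule:
    action: str                      # "allow" or "deny"
    direction: Optional[str] = None  # None means "match anything"
    protocol: Optional[str] = None
    src_net: Optional[str] = None
    dst_net: Optional[str] = None
    dst_port: Optional[int] = None

    def matches(self, p: Packet) -> bool:
        # A rule matches only if every field it specifies agrees with the packet.
        return (
            (self.direction is None or self.direction == p.direction)
            and (self.protocol is None or self.protocol == p.protocol)
            and (self.src_net is None or ip_address(p.src) in ip_network(self.src_net))
            and (self.dst_net is None or ip_address(p.dst) in ip_network(self.dst_net))
            and (self.dst_port is None or self.dst_port == p.dst_port)
        )


def decide(rules, packet, default="deny"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default


# Hypothetical rule set: one rule per filtering style named in the list above.
RULES = [
    Rule("deny", src_net="203.0.113.0/24"),                  # by traffic source
    Rule("allow", direction="in", protocol="tcp",
         dst_net="10.0.0.5/32", dst_port=443),               # combination of parameters
    Rule("allow", direction="out"),                          # all outgoing traffic
]

web_request = Packet("in", "tcp", "198.51.100.7", "10.0.0.5", 443)
blocked_source = Packet("in", "tcp", "203.0.113.9", "10.0.0.5", 443)
print(decide(RULES, web_request), decide(RULES, blocked_source))
```

  • Rule order matters in this sketch: the source-based deny rule is listed first so that it overrides the later allow rule.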
Encryption and VPN
  • Encryption: Transforming text or data into cipher text that cannot be read by unintended recipients
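  • Sketch (illustrative, not from the lecture): a toy cipher that XORs each byte of the message with a random key of the same length. It only shows the idea of turning readable text into cipher text and back; it is an assumption chosen for brevity, and real systems rely on vetted algorithms from audited libraries instead.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key byte at the same position."""
    if len(key) != len(data):
        raise ValueError("key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = "Transfer RMB 500 to account 42"       # made-up message
plaintext = message.encode("utf-8")
key = secrets.token_bytes(len(plaintext))          # random key shared by sender and recipient

ciphertext = xor_bytes(plaintext, key)             # unreadable without the key
recovered = xor_bytes(ciphertext, key).decode("utf-8")  # XOR with the same key undoes it

print(ciphertext.hex())
print(recovered)
```

  • Only someone holding `key` can recover the message, which is the sense in which unintended recipients cannot read the cipher text.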
Virus Monitoring and Prevention
  • Standard precautions
  • Purchase, install, and maintain antivirus software
  • Do not use flash drives or shareware from unknown or suspect sources
  • Use reputable sources when downloading material from the Internet
  • Delete without opening any e-mail message received from an unknown source
  • Do not blindly open e-mail attachments, even if they come from a known source
  • If your computer system is infected by a virus, report it
Secure Data Centers
  • Securing the facility’s infrastructure
  • Site selection
  • Physical access restrictions
  • Intrusion detection
  • Uninterruptible power supply
  • Protection from environmental threats
South Korean Government Data Disaster: Critical Lesson of Backups
  • On September 26, 2025, a catastrophic battery fire at South Korea’s National Information Resources Service datacenter destroyed 858 terabytes of government data, with zero backups in place for the main G-Drive system
  • G-Drive, used primarily by the Ministry of Personnel Management and other agencies, was not backed up due to “large capacity,” despite backup policies for most other systems
  • About 17% of central government officials, and eight years of data, were affected, disrupting email, websites, emergency services, and more
  • As of October 4, only 17.8% of affected services had been restored; full recovery is expected to take about a month
  • Source: https://www.tomshardware.com/pc-components/storage/south-korean-government-learns-the-importance-of-backups-the-hard-way-after-catastrophic-fire-858-terabytes-of-data-goes-up-in-magic-smoke
Other Controls
  • System development controls
  • Technological controls related to ensuring that all systems (including hardware and software) are properly developed, acquired, and maintained
  • Human controls
  • Educating potential users and enacting laws can help, but unethical users will undoubtedly always remain a problem for those wanting to maintain IS security
Organizational Policies and Procedures
  • Information policy – outlines how sensitive information will be handled, stored, transmitted, and destroyed
  • Security policy – explains technical controls on all organizational computer systems, such as access limitations, audit-control software, firewalls, and so on
  • Use policy – outlines the organization’s policy regarding appropriate use of in-house computer systems
  • Backup policy – explains requirements for backing up information
  • Account management policy – lists procedures for adding/removing users
  • Incident handling procedures – list procedures to follow when handling a security breach
  • Disaster recovery plan – lists all the steps an organization will take to restore computer operations in case of a disaster
Monitoring Security
  • Organizations should continuously monitor the effectiveness of the controls
  • The most intensive monitoring efforts should be focused on high-risk systems
  • External entity reviews the controls to uncover any potential problems
  • E.g., auditing computer activity logs, using AI for cybersecurity
  • Organizations should also monitor external events
  • Gather news on cyber security threats: e.g., Information Sharing and Analysis Centers, United States Computer Emergency Readiness Team
  • Responding to security incidents: e.g., problems discovered by Project Zero (Google)
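  • Sketch (illustrative, not from the lecture): auditing activity logs can start with something as simple as counting failed logins per account. The log line format (`<timestamp> <user> <LOGIN_OK|LOGIN_FAIL>`), the user names, and the threshold of three are assumptions made for this example.

```python
from collections import Counter


def flag_failed_logins(log_lines, threshold=3):
    """Return users with at least `threshold` failed logins, sorted by name."""
    failures = Counter()
    for line in log_lines:
        parts = line.split()
        # Assumed format: "<timestamp> <user> <LOGIN_OK|LOGIN_FAIL>"
        if len(parts) == 3 and parts[2] == "LOGIN_FAIL":
            failures[parts[1]] += 1
    return sorted(user for user, count in failures.items() if count >= threshold)


# Made-up log entries for illustration only.
sample_log = [
    "2025-01-01T09:00:00 alice LOGIN_OK",
    "2025-01-01T09:01:10 bob LOGIN_FAIL",
    "2025-01-01T09:01:15 bob LOGIN_FAIL",
    "2025-01-01T09:01:20 bob LOGIN_FAIL",
    "2025-01-01T09:05:00 carol LOGIN_FAIL",
]

print(flag_failed_logins(sample_log))
```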
Class Discussion: Scenario Analysis
  • Scenario:
  • A medium-sized healthcare provider discovers that an employee’s laptop containing unencrypted patient data has been stolen from their car.
  • The laptop was not protected with disk encryption or a strong password.
  • The data includes names, addresses, social security numbers, and medical histories of approximately 5,000 patients.
Discussion Questions
  • How could this incident have been prevented?
  • What are the immediate risks and potential impacts of this incident?
  • What steps should the healthcare provider take in the first 24 hours after discovering the theft?
Video Cases: Dark Web, Deepfakes, Ethical Hacking
  • Dark Web
  • How the dark web works and how it becomes the platform for illegal activities
  • Deepfakes
  • An example of misuse of AI – creation of fake content for fraud or scams
  • How companies are dealing with this problem
  • Ethical hacking
  • How hackers earn money legally
  • What bug bounty programs are
Summary
  • The challenges of cybersecurity
  • Different system vulnerabilities
  • Different types of computer crimes
  • Controls for securing information resources
  • Organizational policies and procedures for information security
Lecture 8

IT & Sustainability

Learning Objectives
  • A quick recap: dark web, deepfakes, ethical hacking
  • Explain the concept of sustainable development
  • Explain the concept of green computing
  • Explain the positive and negative roles of AI in sustainability, and the concept of trustworthy AI
  • Explain the concept of information privacy, fair information practices, and GDPR
  • Exercises: AI for agriculture, facial recognition
Case of Dark Web
  • MyEssec portal is an example of:
  • Surface web
  • Deep web
  • Dark web
  • The Tor browser provides a similar level of protection and performance (e.g., speed) as a VPN.
  • True
  • False
Case of Deepfakes
  • What sort of counterfeit content can be created by using deepfake technology? (choose all applicable options)
  • Voice
  • Image
  • Video
  • Text (hardest to detect; e.g., textfakes on Twitter, Facebook, etc.)
  • Which of the following is currently the best solution to deal with deepfakes on social media platforms?
  • Using AI to distinguish between authentic videos and deepfakes
  • Enforcing legislation to prohibit the sharing of deepfakes
  • Making people aware that videos can be manufactured and manipulated by using deepfake technology
Case of Deepfakes (cont’d)
  • Apart from AI, blockchain is a possible solution to the problem of deepfakes, based on the idea that every original video can be identified by a unique fingerprint stored on the blockchain (i.e., viewers can check the originality of a video on the blockchain). Do you think this idea is applicable to the cases of YouTube or Facebook? What would be the advantages and disadvantages?
  • It depends on the types of video:
  • For the videos for specific purposes (e.g., police), the blockchain solution may be useful and worthwhile.
  • For everyday sharing like those on YouTube, it may be less relevant (at least not for every video).
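  • Sketch (illustrative, not from the lecture): the "unique fingerprint" idea can be pictured as a cryptographic hash of the video file that is recorded in a shared registry. Here a plain dictionary stands in for the blockchain, and the byte strings and publisher name are made up.

```python
import hashlib


def fingerprint(video_bytes: bytes) -> str:
    """SHA-256 digest of the file content, used as the video's fingerprint."""
    return hashlib.sha256(video_bytes).hexdigest()


registry = {}  # stand-in for a blockchain: fingerprint -> publisher


def register(video_bytes: bytes, publisher: str) -> None:
    registry[fingerprint(video_bytes)] = publisher


def check(video_bytes: bytes):
    """Return the registered publisher, or None if the file is unknown."""
    return registry.get(fingerprint(video_bytes))


original = b"frames of the original clip"   # placeholder for real video data
register(original, "police-press-office")

tampered = original + b"x"                   # any change alters the hash

print(check(original))
print(check(tampered))
```

  • One disadvantage is visible in the sketch: an exact hash changes with any modification, so harmless re-encoding or trimming (routine on sharing platforms) would also make a genuine video fail the check.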
Case of Ethical Hacking
  • Traditionally, cybersecurity is managed in-house. Why have more companies started to adopt bug bounty programs recently? (choose all applicable options)
  • Companies can get access to a large pool of highly skilled hackers.
  • The internal security team cannot identify all vulnerabilities in a system.
  • Hackers who work for bug bounty programs are motivated to find and report every single bug in a system. (NOT TRUE)
  • The cost of data breaches is much higher than the spending on bug bounty programs.
  • What are the potential downsides of bug bounty programs?
  • Attracting the attention of both black-hat and white-hat hackers (i.e., difficult to differentiate between an actual attack and an attack from a legitimate bug hunter)
  • Rewarding “bad” behaviors (i.e., hackers withhold their knowledge of a vulnerability until the company pays; potentially give rise to grey-hat hackers)
  • Bug hunters are incentivized to spend as little time as possible per bug
What is Sustainable Development?
  • Sustainable development
  • Development that meets the needs of the present without compromising the ability of future generations to meet their own needs
  • The United Nations has defined 17 Sustainable Development Goals (SDGs) in three categories:
  • Society
  • Economy
  • Environment
Green Computing
  • Practices and technologies for designing, making, using, and disposing of computer hardware to reduce environmental impact
  • Green use: Minimizing the electricity consumption of computers and their peripheral devices and using them in an eco-friendly manner
  • Green disposal: Repurposing existing equipment or appropriately disposing of, or recycling, unwanted electronic equipment
  • Green design: Designing energy-efficient computers, servers, printers, projectors and other digital devices
  • Green manufacturing: Minimizing waste during the manufacturing of computers and other subsystems to reduce the environmental impact of these activities
  • Source: https://www.techopedia.com/definition/14753/green-computing
Old and Dirty Energy Drives Global Internet Growth
  • Internet central to all aspects of modern society
  • By 2030, more than 29 billion IoT connected devices (Statista)
  • Smartphones, connected vehicles, smart grid, asset tracking, etc.
  • Will require massive amounts of energy
  • Coal is a cheap energy source but not clean
  • Apple is the most aggressive clean energy user (Greenpeace report)
  • Other clean energy users include Facebook, Google, Amazon, and Microsoft
Data Centers
  • Big IT companies (e.g., Google, Facebook, Amazon, etc.) have large amounts of data to be managed
  • Data centers require dedicated physical space for infrastructure and much electricity to operate
  • The centralization of data centers facilitates:
  • Management
  • Repairs
  • Upgrades
  • Security
  • Novel approaches using natural cooling
  • Google server facility in Finland
  • Facebook server in Lulea, Sweden
  • How are Google’s data centers powered sustainably?
Mining Cryptocurrency is Power-Hungry
  • Bitcoin, by design, is literally anti-efficient
  • Computers are constantly working
  • People connect more miners to the network to increase profits
  • Bitcoin uses more electricity annually than the whole of Argentina, analysis by Cambridge University suggests
  • It consumes around 121.36 terawatt-hours (TWh) a year, and that consumption is unlikely to fall unless the value of the currency slumps
  • Critics say electric-car firm Tesla's decision to invest heavily in Bitcoin undermines its environmental image
  • Some suggested that a carbon tax on cryptocurrencies could be introduced
  • Source: https://www.bbc.com/news/technology-56012952
Energy Consumption of Blockchain Technology
  • Sedlmeir, J., Buhl, H.U., Fridgen, G. et al. The Energy Consumption of Blockchain Technology: Beyond Myth. Bus Inf Syst Eng 62, 599–608 (2020). https://doi.org/10.1007/s12599-020-00656-x
  • The Proof-of-Work (PoW) algorithm of blockchain requires the nodes on the network to compete against each other to validate transactions, leading to higher energy consumption than traditional centralized systems.
  • Firms will need to find the best compromise between performance, security, and energy consumption.
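  • Sketch (illustrative, not from the paper): a simplified proof-of-work loop that tries nonces until a SHA-256 hash starts with a chosen number of zero hex digits. The block text and difficulty are made up, and real networks such as Bitcoin use a different block format and target rule; the point is only that the work consists of repeated guessing.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("alice pays bob 5", difficulty=3)
attempts = nonce + 1   # every attempt is one hash computation, i.e. spent energy

print(attempts, digest)
```

  • In this sketch each additional required zero multiplies the expected number of attempts by 16, and every competing node repeats the same search, which is where the extra energy goes.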
Increasing Power Demands of Machine Learning
  • From 1959 to 2012, the amount of computing power required by AI systems doubled every two years (following Moore’s law).
  • From 2012 onwards, the computing power required for today’s machine-learning systems has been doubling every 3.4 months.
  • Example: NVIDIA developed a massive natural-language model that was 24 times bigger than its predecessor and yet was only 34% better at its learning task.
  • Sources: https://openai.com/blog/ai-and-compute/#addendum; https://www.theguardian.com/commentisfree/2019/nov/16/can-planet-afford-exorbitant-power-demands-of-machine-learning
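  • Worked example (illustrative): converting the two doubling periods above into growth per year makes the difference concrete. The helper name `growth_factor` is made up for this sketch.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """How many times larger a quantity becomes after `months`,
    if it doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)


moore_per_year = growth_factor(12, 24)     # doubling every two years
modern_per_year = growth_factor(12, 3.4)   # doubling every 3.4 months

print(round(moore_per_year, 2), round(modern_per_year, 1))
```

  • Doubling every two years works out to roughly 1.4 times per year, while doubling every 3.4 months works out to more than 11 times per year.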
ChatGPT Uses 17,000 Times More Electricity Than a US Household
  • ChatGPT is using up more than half a million kilowatt-hours of electricity to respond to some 200 million requests a day
  • A single GPT query consumes 15x more energy than a Google search query
  • Significantly more electricity usage with the growth of generative AI
  • Example: if Google/Bing integrates generative AI into every search
  • Source: https://www.businessinsider.com/chatgpt-uses-17-thousand-times-more-electricity-than-us-household-2024-3
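  • Worked example (illustrative): the figures above imply some per-request numbers. The calculation treats "more than half a million kWh" and "some 200 million requests" as exact, so the results are rough estimates rather than measured values.

```python
DAILY_KWH = 500_000            # "more than half a million kilowatt-hours" per day
DAILY_REQUESTS = 200_000_000   # "some 200 million requests a day"
HOUSEHOLD_RATIO = 17_000       # "17,000 times more electricity than a US household"
SEARCH_RATIO = 15              # "15x more energy than a Google search query"

wh_per_request = DAILY_KWH * 1000 / DAILY_REQUESTS    # watt-hours per request
household_kwh_per_day = DAILY_KWH / HOUSEHOLD_RATIO    # implied household use per day
wh_per_search = wh_per_request / SEARCH_RATIO          # implied energy per search

print(wh_per_request, round(household_kwh_per_day, 1), round(wh_per_search, 2))
```

  • Under these assumptions a request uses about 2.5 Wh, and the implied household consumption is a little over 29 kWh per day.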
How does IT Contribute to Sustainability?
  • Some innovations covered in this course:
  • Sharing economy
  • Cloud computing
  • Big data
  • Blockchain
  • Artificial intelligence (AI)
  • These innovations affect different aspects of people’s lives and contribute to social good:
  • Health, food safety, energy, public transportation, etc.
AI as the Major Enabler of Sustainability
  • Some applications of AI for social good:
  • Identifying and quantifying online abuse against women on Twitter
  • Automated monitoring of viral cassava disease (for agriculture)
  • Using satellite imagery to identify burned-down villages in conflict zones in Darfur
  • Climate informatics for climate action
Example: Using Satellite Imagery to Predict Poverty
  • A project of Stanford University combines machine learning with high-resolution satellite imagery (daytime and nighttime images) and survey data on village-level wealth to predict the distribution of poverty.
Some Examples of “AI Gone Wrong”
  • Racism in the algorithm used by US hospitals to allocate healthcare – black people receive less care than white people
  • Bias against women in Amazon’s resume filtering tool – due to the dominance of men across the tech industry
  • Gender and race bias in face detection – AI services of IBM, Microsoft, and Face++ show lower accuracy for darker-skinned people, particularly darker-skinned women
  • Racial bias in face recognition algorithms – Asian and African American people were up to 100 times more likely to be misidentified than white men
  • Source: https://www.xyonix.com/blog/how-to-detect-and-mitigate-harmful-societal-bias-in-your-organizations-ai
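  • Sketch (illustrative, not from the source): one simple way to surface bias of the kind listed above is to compare accuracy across groups. The group labels and records below are made up, and a real audit would use more than one metric.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, predicted, actual) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}


# Made-up face-matching results for two hypothetical groups.
records = [
    ("A", "match", "match"), ("A", "match", "match"),
    ("A", "no", "no"), ("A", "match", "no"),
    ("B", "match", "no"), ("B", "no", "match"),
    ("B", "match", "match"), ("B", "no", "no"),
]

scores = accuracy_by_group(records)
gap = max(scores.values()) - min(scores.values())

print(scores, gap)
```

  • A large gap between groups is a signal to examine the training data and model before deployment.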
Principles for Successful Applications of AI for Social Good
  • Set expectations of what is possible with AI
  • Consider the value of simple solutions (e.g., visualization) – not necessarily complex machine learning (ML) methods
  • Ensure fairness of AI applications – whether there are societal biases reflected in the data for ML
  • Goals and use cases should be clear and well-defined – provide the correct metric for measuring the desired effect
  • AI solutions should aim to be cost-effective – skills-based volunteering and grants (e.g., Google AI Impact Challenge)
  • Process data securely, with respect for human rights and privacy
  • Tomašev, Nenad, et al. "AI for social good: unlocking the opportunity for positive impact." Nature Communications 11.1 (2020): 1-6.
Ethics Guidelines for Trustworthy AI in the European Union
  • Human agency and oversight – fundamental rights, human agency and human oversight
  • Technical robustness and safety – resilience to attack and security, fall back plan and general safety, accuracy, reliability and reproducibility
  • Privacy and data governance – respect for privacy, quality and integrity of data, and access to data
  • Transparency – traceability, explainability and communication
  • Diversity, non-discrimination and fairness – avoidance of unfair bias, accessibility and universal design, and stakeholder participation
  • Societal and environmental wellbeing – sustainability and environmental friendliness, social impact, society and democracy
  • Accountability – auditability, minimisation and reporting of negative impact, trade-offs and redress
  • Source: https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines/1.html
Information Privacy
  • How much control do you have over your data?
Discussion: Quotes from Tim Cook
  • “A few years ago, users of Internet services began to realize that when an online service is free, you’re not the customer. You’re the product. But at Apple, we believe a great customer experience shouldn’t come at the expense of your privacy.”
  • “We don’t build a profile based on your email content or web browsing habits to sell to advertisers. We don’t "monetize" the information you store on your iPhone or in iCloud. And we don’t read your email or your messages to get information to market to you.”
  • What are the problems Tim Cook implied? What are the companies he might refer to?
Privacy and Information Rights
  • Privacy
  • Moral right of individuals to be left alone, free from surveillance or interference from other individuals, organizations, or state
  • Information privacy: 4 premises
  • Right to control information collected about individuals
  • “Right to be forgotten”
  • Right to know when information is collected and give consent
  • “Informed consent”
  • Right to personal information due process
  • Individuals’ legal rights must be respected
  • Right to have personal information stored in a secure manner
Information Collected by Websites (1)
  • How information about individuals is collected
  • Registration
  • Users provide names, addresses, phone numbers, email addresses, hobbies, etc.
  • Cookies
  • Allow websites to collect a user’s preferences, interests, and surfing patterns
  • Web beacons (Web bugs)
  • Tiny graphics embedded in e-mail messages and Web pages
  • Monitor who is reading e-mail message or visiting site
  • Spyware
  • All unwanted software programs designed to steal proprietary information, or that target data stores containing confidential information
  • E.g., A keystroke logger runs in the background of a user’s computer and records every keystroke the user makes
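  • Sketch (illustrative, not from the lecture): how a cookie lets a site recognize a returning browser, using Python's standard `http.cookies` module. The cookie names and values (`visitor_id`, `theme`) are made up.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header that tags the browser with an ID.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"          # hypothetical identifier
outgoing["visitor_id"]["max-age"] = 3600   # ask the browser to keep it for one hour
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Later request: the browser sends its cookies back and the site reads them.
incoming = SimpleCookie()
incoming.load("visitor_id=abc123; theme=dark")
visitor = incoming["visitor_id"].value
theme = incoming["theme"].value

print(set_cookie_header)
print(visitor, theme)
```

  • Because the same identifier comes back with every request, the site can link page views over time into a record of preferences and surfing patterns.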
Information Collected by Websites (2)
  • Data collected includes
  • Personally identifiable information (PII)
  • Any data that can be used to identify, locate, or contact an individual
  • Anonymous information
  • Person not identified by name, only assigned code
  • Could include demographic information such as age, occupation, income, zip code, as well as behavioral data such as browsing behavior
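  • Sketch (illustrative, not from the lecture): turning personally identifiable records into "anonymous" ones by replacing the identifying fields with an assigned code. The field names, code format, and sample rows are assumptions for this example.

```python
import itertools


def pseudonymize(rows, id_fields=("name", "email")):
    """Drop identifying fields and give each distinct person a stable code."""
    codes = {}
    counter = itertools.count(1)
    result = []
    for row in rows:
        key = tuple(row[field] for field in id_fields)
        if key not in codes:
            codes[key] = f"U{next(counter):04d}"
        anonymous = {k: v for k, v in row.items() if k not in id_fields}
        anonymous["code"] = codes[key]
        result.append(anonymous)
    return result


# Made-up records for illustration only.
rows = [
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34, "zip": "75001", "page": "/shoes"},
    {"name": "Bo Chan", "email": "bo@example.com", "age": 51, "zip": "69002", "page": "/bags"},
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34, "zip": "75001", "page": "/cart"},
]

anonymous_rows = pseudonymize(rows)
print(anonymous_rows)
```

  • The remaining demographic and behavioral fields stay in the data, so combinations such as age plus zip code may still narrow a record down to one person.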
Information Collected by Websites (3)
  • Types of personal data collected
  • Name, address, phone, e-mail, social security
  • Bank and credit accounts, gender, age, occupation, education
  • Preference data, transaction data, clickstream data, browser type
Key Issues in Online Privacy of Consumers
  • Top concerns
  • Profiling and ad targeting
  • Social network privacy
  • Sharing of information by marketers
  • Mobile phone privacy
  • Digital assistant privacy
Marketing: Profiling, Behavioral Targeting, and Retargeting (1)
  • Profiling
  • Creation of data records that characterize online individual and group behavior
  • Anonymous profiles – without real names, photos, or other identifiable information
  • Personal profiles – with known identity
  • Facial recognition as a new means of profiling
  • Advertising networks
  • Track consumer and browsing behavior on Web
  • Dynamically adjust what a user sees on screen
  • Build and refresh profiles of consumers
Marketing: Profiling, Behavioral Targeting, and Retargeting (2)
  • Business perspective:
  • Increases effectiveness of advertising, subsidizes content
  • Enables sensing of demand for new products
  • Critics’ perspective:
  • Undermines expectation of anonymity and privacy
  • Enables price discrimination
Social Networks: Privacy and Self-Revelation
  • Social networks
  • Encourage sharing personal details
  • Pose unique challenge to maintaining privacy
  • Facebook
  • Massive database
  • Serving ads to users not on Facebook
  • Sharing information with third parties
  • Conflict of interest:
  • Personal control over personal information vs. organization’s desire to monetize social network
Example: Maps of Facebook Networks and Your Romantic Relationship
  • A study found that the total number of mutual friends two people share — embeddedness — is actually a fairly weak indicator of romantic relationships.
  • “A spouse or romantic partner is a bridge between a person’s different social worlds,” Mr. Kleinberg explained.
  • The dispersion algorithm was able to correctly identify a user’s spouse 60 percent of the time, or better than a 1-in-2 chance.
  • A couple in a declared relationship and without a high dispersion on the site are 50 percent more likely to break up over the next two months than a couple with a high dispersion, the researchers found.
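  • Sketch (illustrative): embeddedness is the count of mutual friends. For dispersion, this sketch uses one simple reading (count the pairs of mutual friends who are not linked to each other and share no other friends), which may differ from the study's exact formula. The friendship graph is made up: `u` has a clique of co-workers (`w1` to `w4`), two relatives, a college friend, and a partner `v` who knows one person from each circle.

```python
from itertools import combinations


def build_graph(edges):
    """Undirected friendship graph as a dict of neighbor sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def embeddedness(graph, u, v):
    """Number of mutual friends of u and v."""
    return len(graph[u] & graph[v])


def dispersion(graph, u, v):
    """Count pairs of mutual friends that are not connected to each other,
    neither directly nor through a shared friend other than u and v."""
    mutual = graph[u] & graph[v]
    score = 0
    for s, t in combinations(sorted(mutual), 2):
        linked = t in graph[s]
        shared = (graph[s] & graph[t]) - {u, v}
        if not linked and not shared:
            score += 1
    return score


edges = [
    ("u", "v"), ("u", "w1"), ("u", "w2"), ("u", "w3"), ("u", "w4"),
    ("u", "f1"), ("u", "f2"), ("u", "c1"),
    ("w1", "w2"), ("w1", "w3"), ("w1", "w4"),
    ("w2", "w3"), ("w2", "w4"), ("w3", "w4"),   # co-workers all know each other
    ("f1", "f2"),                               # relatives know each other
    ("v", "w2"), ("v", "f1"), ("v", "c1"),      # partner bridges the circles
]
graph = build_graph(edges)

print(embeddedness(graph, "u", "w1"), dispersion(graph, "u", "w1"))
print(embeddedness(graph, "u", "v"), dispersion(graph, "u", "v"))
```

  • In this toy graph the co-worker and the partner have the same embeddedness (3 mutual friends each), but only the partner has non-zero dispersion, which mirrors the "bridge between social worlds" idea.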
Mobile Devices: Privacy Issues
  • Mobile apps
  • Funnel personal information to mobile advertisers for targeting ads
  • Track and store user locations
  • Track users’ use of other apps
  • Persistent location tracking
  • The U.S. Supreme Court ruled that police need a warrant before searching a cell phone for information
  • E.g., Apple refused FBI’s request to unlock a criminal’s iPhone
Privacy Rights and Protection
  • Two key options for consumers:
  • Opt-out
  • Business practice that gives consumers the opportunity to refuse sharing information about themselves
  • E.g., While the default option is to let a firm send you advertising emails, you can uncheck a box on the settings page to withdraw your consent
  • Opt-in
  • Agreement that requires users to take specific steps to allow the collection of personal information
  • E.g., Users click the “agree” button to accept cookies when visiting a website
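  • Sketch (illustrative, not from the lecture): the two models differ only in what happens when the user does nothing. The function name and the "advertising e-mail" use case are assumptions for this example.

```python
def may_send_ads(model: str, user_action_taken: bool) -> bool:
    """Decide whether a firm may send advertising e-mails.

    opt-out: consent is assumed until the user acts (e.g., unchecks the box).
    opt-in:  there is no consent until the user acts (e.g., clicks "agree").
    """
    if model == "opt-out":
        return not user_action_taken
    if model == "opt-in":
        return user_action_taken
    raise ValueError(f"unknown consent model: {model}")


# A user who never touches the settings page:
print(may_send_ads("opt-out", user_action_taken=False))
print(may_send_ads("opt-in", user_action_taken=False))
```

  • For a passive user, the opt-out model permits data use while the opt-in model does not.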
Fair Information Practices (FIP)
  • Set of principles governing the use of information
  • Basis of most U.S. and European privacy laws
  • Restated and extended by Federal Trade Commission (FTC) in 2009 to provide guidelines for behavioral targeting
  • Used to drive changes in privacy legislation
  • COPPA, Gramm-Leach-Bliley Act, HIPAA
FIP Principles
  • Notice/awareness (core principle)
  • Web sites must disclose practices before collecting data.
  • Choice/consent (core principle)
  • Consumers must be able to choose how information is used for secondary purposes.
  • Access/participation
  • Consumers must be able to review, contest accuracy of personal data.
  • Security
  • Data collectors must take steps to ensure accuracy, security of personal data.
  • Enforcement
  • There must be mechanisms to enforce FIP principles.
Who Holds the Responsibility?
  • U.S. businesses use safe harbor framework
  • Self-regulating policy and enforcement that meets objectives of government legislation but does not involve government regulation or enforcement
  • U.S. businesses are allowed to gather transaction information and use this for other marketing purposes
  • Online industry promotes self-regulation over privacy legislation
  • However, extent of responsibility taken varies:
  • Complex/ambiguous privacy statements
  • Opt-out models selected over Opt-in
  • Online “seals” of privacy principles (e.g., TRUSTe certifying Web sites adhering to certain privacy principles)
European Directive on Data Protection
  • Companies must inform people when information is collected and disclose how it is stored and used
  • Requires informed consent of customer
  • EU member nations cannot transfer personal data to countries without similar privacy protection
  • All companies operating in Europe are required to comply with user consent rules
  • Example:
  • An Austrian law student was able to get a full copy of his personal information from Facebook’s Dublin office, due to the more stringent consumer privacy protections in Ireland. The full document was 1,222 pages long and covered three years of activity on the site, including deleted Wall posts and messages with sensitive personal information and deleted e-mail addresses.
General Data Protection Regulation in the EU
  • Source: https://www.cyber-duck.co.uk/insights/introducing-gdpr-the-basics-of-the-new-data-protection-regulation
Technical Solutions for Individuals
  • Anonymous browsing tools (e.g., VPN)
  • Cookie prevention and management
  • Browser features
  • “Private” browsing
  • “Do not track” feature
  • For the most part, these solutions fail to prevent users from being tracked from site to site
Video Cases: AI for Agriculture, Facial Recognition
  • AI for Agriculture
  • An example of using AI to make agriculture more sustainable
  • How IBM implemented data analytics and AI solutions to help farmers produce more food with the same land
  • Facial Recognition
  • An example that explains the ethical aspect of AI use
  • How facial recognition may lead to invasion of personal privacy
Lecture 9

Course Wrap-Up

General Questions on AI (1)
  • Computers are supposed to work objectively as they do not have any emotion. Yet, the use of AI sometimes leads to biased decisions, such as discrimination against women and minorities. Which of the following statements are correct? (choose all applicable options)
  • AI algorithms are often trained with previous data that may consist of biased decisions.
  • In the real world, AI can be made completely unbiased. (NOT TRUE)
  • Certain groups of people may be under-represented in the training data.
  • Designers unknowingly introduce biases to the AI model.
General Questions on AI (2)
  • The prevalence of AI applications may compromise personal privacy. Which of the following statements are correct? (choose all applicable options)
  • People are often unaware how much data their devices generate, process, or share.
  • AI can be used to identify and track individuals across multiple devices, even if their personal data is anonymized.
  • People often have the opportunity to give informed consent before sharing their data for AI applications. (NOT TRUE)
  • To predict sensitive information (e.g., emotional states), AI always requires sensitive personal data as the input. (NOT TRUE)
Case: AI for Agriculture
  • What are the potential disadvantages of using AI in agriculture?
  • Inequality due to unequal access to such technology
  • Lack of access for poor farmers
  • High operating and maintenance costs
  • Job loss due to automation
Case: Facial Recognition
  • What are the ethical concerns about facial recognition technology?
  • Abuse of facial recognition technology for mass surveillance
  • Racial discrimination/profiling
  • Data security and identity risks
Course Wrap-up
  • Topics:
  • Introduction to e-business (+sharing economy)
  • E-business revenue models
  • E-business infrastructure (cloud computing)
  • Big data and blockchain
  • Introduction to artificial intelligence
  • Generative AI
  • Cybersecurity
  • IT and sustainability (+information privacy)
Role of IS/IT in Value Chain Analysis
  • You should now be able to identify opportunities where information systems can be used to gain a competitive advantage
Lecture 1 – Introduction to E-Business
  • Definitions of e-commerce and e-business
  • Key drivers and barriers to e-commerce/e-business adoption by organizations and consumers
  • Value chain analysis for identifying e-business opportunities
  • Sharing economy
E-Commerce and E-Business
  • Electronic commerce
  • Buying and selling of products using the Internet
  • Not only financial transactions between organizations and customers
  • All electronically mediated transactions between an organization and third parties
  • All pre-sale and post-sale activities across the supply chain
  • Electronic business
  • A broader term
  • Transformation of key business processes (e.g., research & development, marketing, manufacturing, logistics, etc.) using IT
What is Sharing Economy?
  • The sharing economy is defined as an economic system in which assets and services are shared between private individuals
  • Based on pooling and exchanging services, resources, goods, time, knowledge, skills, etc.
  • Also known as the collaborative economy
  • Examples: Uber, Airbnb
  • Key concepts: three key differentiation strategies
  • Technology
  • Partnership
  • User experience
Lecture 2 – E-Business Revenue Models
  • Various web revenue models (web catalog, digital content, advertising-supported, advertising-subscription mixed, fee-based)
  • Revenue strategy issues (channel conflict, showrooming)
  • Value Proposition Canvas and Business Model Canvas
Revenue Models for Online Business
  • Web business revenue-generating models
  • Web catalog
  • Digital content
  • Advertising-supported
  • Advertising-subscription mixed
  • Fee-based
  • Same model can work for both sale types
  • Business-to-Consumer (B2C)
  • Business-to-Business (B2B)
How Freeconomics Works
  • Freeconomics is the leveraging of digital technologies to provide free goods and services to customers as a business strategy for gaining a competitive advantage
  • Two key considerations:
  • Marginal costs for digital services (e.g., storage price, reproduction costs) decrease tremendously over years
  • Revenue per user increases as there are more ways to exploit consumers (e.g., targeted ads)
Lecture 3 – E-Business Infrastructure
  • Web infrastructure basics
  • Infrastructure management issues
  • Cloud computing
  • Case: Commonwealth Bank of Australia
  • Exercise: case study on Google Cloud Platform and Riot Games
Infrastructure Management Issues
  • Storage needs
  • Demand fluctuations
  • Scalability
  • Cost management
Cloud Computing Platform
  • In cloud computing, hardware and software capabilities are provided as services over the Internet. Businesses and employees have access to applications and IT infrastructure anywhere at any time using an Internet-connected device.
Cloud Computing Service Models
  • Software as a Service (SaaS)
  • Hosts preinstalled applications which users just buy access to
  • Platform as a Service (PaaS)
  • Hosts an environment which programs can be executed on
  • Infrastructure as a Service (IaaS)
  • Hosts virtual machines which the customer installs an operating system on
Lecture 4 – Big Data and Blockchain
  • Basics of big data and its terminology
  • The concept of data mining
  • Some examples of big data applications (Web analytics, mobile marketing)
  • Blockchain and its applications
  • Exercise: case study on Glu Mobile and Stronghold
Five Common Blockchain Myths
  • Source: https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/blockchain-explained-what-it-is-and-isnt-and-why-it-matters
Lecture 5 – Introduction to AI
  • The concept of artificial intelligence (AI)
  • The concept of machine learning
  • Examples of AI and machine learning
  • Future of AI
  • In-class exercises: decision tree classification
  • Exercise: MNIST number recognition, video cases of VideoPeel and AI Cat Flap
AI Systems Vs. Human Beings
  • AI can now support “Artistic Creativity” to a considerable extent, with the advances in computer vision and machine learning.
  • Some argued that artists cannot be replaced by machines.
Relationship between AI, Machine Learning, Neural Networks, Deep Learning, and GenAI
  • Source: https://www.engenome.com/news/AI-regulation/
Lessons from AI Research
  • Clearly-defined tasks that require intelligence and education from humans tend to be doable for AI techniques
  • Playing chess, drawing logical inferences from clearly-stated facts, performing probability calculations in well-defined environments, etc.
  • Complex, messy, ambiguous tasks that come naturally to humans are much harder
  • Remarkable progress in recent years, especially in machine learning for narrow domains
  • Examples: image recognition, speech recognition, reinforcement learning in computer games, self-driving cars
  • AI systems still lack
  • Broad understanding of the world, common sense, ability to learn from very few examples, truly out-of-the-box creativity
Lecture 6 – Generative AI
  • The concept of Generative AI (GenAI)
  • The nature and limitations of large language models (LLMs)
  • Unique features of GenAI
  • The use of GenAI and its limitations in a business context
  • In-class exercise: AI-powered market entry strategy
  • Exercise: video cases of Saks and “AI Tries 20 Jobs”
What is Generative AI?
  • Definition
  • AI that creates original content in response to prompts or requests
  • Content can be in various forms: text, images, video, audio, software code
  • Key characteristic
  • GenAI is able to generate new and original content autonomously
  • Other AI systems primarily analyze or classify existing data
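The contrast between generating new content and analyzing existing data can be sketched with a toy example. The code below is only an illustration under stated assumptions: the corpus, the function names, and the label set are all hypothetical, and a bigram (Markov chain) generator is far simpler than the models behind real GenAI products. It shows one function that produces a new word sequence and one that only assigns a label to a sentence it is given.

```python
import random
from collections import defaultdict

# Hypothetical toy corpus, invented for this illustration only.
CORPUS = "the shop sells tea . the shop sells coffee . the cafe sells tea ."


def build_bigrams(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Generative task: produce a word sequence by sampling next words."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # no known continuation, so stop early
            break
        out.append(rng.choice(options))
    return " ".join(out)


def classify(sentence):
    """Analytic task: assign an existing sentence to a fixed label."""
    return "coffee" if "coffee" in sentence.split() else "other"


if __name__ == "__main__":
    model = build_bigrams(CORPUS)
    print(generate(model, "the", 6))           # a sampled word sequence
    print(classify("the shop sells coffee ."))  # a label for given text
```

The generator may emit a sequence that never appears in the corpus (for example, one that combines "cafe" with "coffee"), whereas the classifier can only return one of its predefined labels.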
What are Large Language Models (LLMs)?
  • Large Language Models (LLMs) are advanced AI models that understand and generate human language text.
  • They are trained on vast amounts of text data and can be fine-tuned for specific language tasks, such as language translation, text summarization, sentiment analysis, and more.
  • LLMs are specific types of models used within the field of Natural Language Processing (NLP).
Lecture 7 – Cybersecurity
  • The challenges of cybersecurity
  • Different system vulnerabilities
  • Different types of computer crimes
  • Controls for securing information resources
  • Organizational policies and procedures for information security
On the dark web, people can sell or buy a range of cybercrime-related services, e.g., identifying valuable targets for attack, providing fake information to mislead targets, and providing money-laundering networks to clean illegal money.
  • Any person or business can purchase cyberattack services to target systems that hold something of value to steal or disrupt.
What is Good Cybersecurity?
  • How to achieve the highest degree of security
  • New technologies
  • Organizational policies and procedures
  • Industry standards and government laws
  • Other factors
  • Time value of money (protect data for a few hours or days?)
  • Cost of security versus potential loss
  • Security often breaks at the weakest link
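The "cost of security versus potential loss" trade-off can be made concrete with a small calculation. This is a minimal sketch, not a method from the lecture: it assumes that the expected yearly loss is the loss per incident multiplied by the expected number of incidents per year, and every figure and function name is hypothetical.

```python
def expected_annual_loss(loss_per_incident, incidents_per_year):
    """Expected yearly loss = loss per incident x expected incidents per year."""
    return loss_per_incident * incidents_per_year


def control_is_worthwhile(loss_per_incident, rate_before, rate_after,
                          annual_control_cost):
    """True when the expected loss avoided exceeds the cost of the control."""
    avoided = (expected_annual_loss(loss_per_incident, rate_before)
               - expected_annual_loss(loss_per_incident, rate_after))
    return avoided > annual_control_cost


if __name__ == "__main__":
    # Hypothetical figures: a breach costing 200,000 is expected once every
    # two years (0.5 per year); a control lowers that to once every ten years.
    print(control_is_worthwhile(200_000, 0.5, 0.1, 30_000))   # cheap control
    print(control_is_worthwhile(200_000, 0.5, 0.1, 120_000))  # costly control
```

Under these assumed numbers the control avoids roughly 80,000 of expected loss per year, so it pays for itself at an annual cost of 30,000 but not at 120,000.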
Common Controls to Achieve Security
  • Physical access restrictions
  • Firewalls
  • Encryption
  • Virus monitoring and prevention
  • Secure data centers
  • Systems development controls
  • Human controls
Lecture 8 – IT and Sustainability
  • Sustainable development
  • Green computing
  • Roles of AI in sustainability
  • Information privacy
How does IT Contribute to Sustainability?
  • Some innovations covered in this course:
  • Sharing economy
  • Cloud computing
  • Big data
  • Blockchain
  • Artificial intelligence (AI)
  • These innovations affect different aspects of people’s lives and contribute to social good:
  • Health, food safety, energy, public transportation, etc.
FIP Principles
  • Notice/awareness (core principle)
  • Web sites must disclose practices before collecting data.
  • Choice/consent (core principle)
  • Consumers must be able to choose how information is used for secondary purposes.
  • Access/participation
  • Consumers must be able to review and contest the accuracy of their personal data.
  • Security
  • Data collectors must take steps to ensure the accuracy and security of personal data.
  • Enforcement
  • There must be mechanisms to enforce FIP principles.
Final Quiz Information
  • Coverage – materials covered in all lectures (Lectures 1-8), including lecture slides and cases
  • The quiz consists of 40 multiple-choice questions
  • These questions will test your understanding of key concepts
  • The format is similar to that of the weekly quizzes
  • One correct option for each question
  • No score deduction for wrong answers
  • Duration of the quiz is 60 minutes
  • The quiz is closed-book (no access to lecture slides, cheat sheets, or any other materials)
Advice for Preparation
  • Review the lecture slides
  • Review the cases and examples
  • A common question from students: Do we have to memorize everything on the slides?
  • ANSWER: You will not be tested on subtle details (e.g., statistical figures), but you do need to know the key concepts covered in each lecture.