讲解SQL 数据库、辅导讲解SQL 语句、辅导SQL produce 辅导Python编程|辅导Python程序

The below table contains a record of every swipe that occurs on Tinder. It has the id of the user who swipes, the if of the user who is swiped on, the time the swipe occurred, and whether the swipe was a right swipe or a left swipe.

Table name: ‘swipes’
swiping_user_id (integer) swiped_on_user_id (integer) swipe_at (timestamp) swipe_type
(varchar)
1958315 7247259 2015-03-15 00:01:05 ‘right’
5823050 8732832 2014-06-10 05:12:05 ‘left’
7948392 6767291 2015-08-10 12:45:01 ‘left’
… … …

1) Write a query that lists all users who have both swiped right and swiped left. (relevant time period: all time)

select distinct a.swiping_user_id as swiping_user_id
from swipes a, swipes b
where a.swiping_user_id=b.swiping_user_id
and a.swipe_type!=b.swipe_type
order by a.swiping_user_id desc;

2) Write a query that determines the % of right swipes per day that resulted in a match. A match occurs when user A right swipes on user B, and user B right swipes on user A. (relevant time period: all time)

select
cast(c.swipe_at as date) as swipe_day,
count(distinct a.swiping_user_id, a.swiped_on_user_id) / count(c.swipe_type) as percentage
from swipes a, swipes b, swipes c
where c.swipe_type = 'right'
and a.swiping_user_id = b.swiped_on_user_id
and b.swiping_user_id = a.swiped_on_user_id
and a.swipe_type = b.swipe_type
and a.swipe_type = 'right'
and c.swipe_type = 'right'
A/B Test Question

The table below shows results from an A/B/C test of a product feature on a continuous metric.

Mean Standard Deviation % Lift
(relative to control) N
Control 7 4 ﹣ 12,000
Variant 1 6.7 3 -4.2% 12,000
Variant 2 7.1 6 1.4% 12,000

3) In 2-3 sentences, describe the results of this test and the recommendation you would make to the product manager. Make any necessary assumptions but please state those assumptions.

#assumption: the higher score the better
#assumption: our sample size is larger than required to prove two variants are statistically different

Variant 2 has higher mean and are more spread out. The lift rate indicated variant 2 has better performance of our targeting model. I would recommend pursue variant 2, meanwhile working on lower the standard deviation.
4) The Product team is looking for interesting insights into user swipe behavior. Can you think of an interesting analysis you can do with the data from the swipes table above and how it could inform the product? Please provide the corresponding query to pull the information (if possible).

One thing I would like to improve base on our existing data is app recommendation time. Instead of sending out push notification at noon, we selectively send a notification to users not using the app for more than 24 hours and at his/her most likely hour of using the app.
。。。sql？，average？
Question 1:
Given a table with following schema: create table test_a(id integer, test_name varchar(100));
1. How can you select all the records where the 'id' column is an even number? All the odd number records?

Select * From test_a
Where test_a.id%2=0 --even number
Select * From test_a
Where test_a.id%2=1 -- odd number

2. There are 2 ways you could remove all of the rows from the table test_a. 'DELETE FROM TEST_A' and 'TRUNCATE TABLE TEST_A'. Which statement would you use? Is there a difference between the 'DELETE' and 'TRUNCATE' statements?

Depends on quantity and what kind of access do I have.

TRUNCATE is a DDL command, executed using a table lock. It removes data by deallocating the data pages used to store the data. It will minimal logging in the transaction log, so it's fast. However, it cannot be used with indexed views.

Delete is a DML command, executed using a row lock. It maintains the log, so it's slower compared to TRUNCATE. It can be used with indexed views.

3. What is the difference between the WHERE and HAVING clauses?

WHERE clause is used for filtering rows while HAVING clause is used to filter groups. Additionally, WHERE is used before the GROUP BY clause, and HAVING is used after the GROUP BY clause.

4. Write a sql statement that would determine if table 'test_a' has duplicate records.

SELECT * FROM test_a
GROUP BY id, test_name
HAVING COUNT (*) > 1

5. Assuming table 'test_a' has duplicate records, can the duplicate records be removed using a single 'DELETE' statement? If so, write the sql statement. If not, write the sql to remove the duplicate rows.

WITH test_distinct AS
(
SELECT *, ROW_NUMBER () Over(Partition BY id ORDER BY id) AS RowNumber
FROM test_a
)
DELETE FROM test_distinct WHERE RowNumber > 1

Question 2:
Three tables need to be joined to produce an aggregated result.

The first table is the events one. Every few seconds, a user clicks on a link somewhere and we receive an associated event log with some metadata.
The columns we receive on each events are:
● timestamp
● event_type (one of 'search', 'click', 'land')
● network (one of 'google' or 'yahoo': the network the event belongs to.)
● partner
● device_type
● domain
● tag

For each network (google or yahoo) in the events table, we have a corresponding report from the network itself. Each event uses (domain or tag) along with device_type to identify the corresponding row in the network table. Each event only matches to one network.
See the below tables.

The second table is the google_revenue one.
The columns we receive from google are:
● day
● domain
● device_type
● network_reported_searches (an integer type)
● network_reported_clicks (an integer type)
● network_revenue (a decimal type)

The third table is the yahoo_revenue one.
The columns we receive from yahoo are:
● day
● tag
● device_type
● network_reported_searches (an integer type)
● network_reported_clicks (an integer type)
● revenue (a decimal type)

The result table will have aggregated results for each day, partner, and network.
The columns we want are:
● day
● partner (from the events table)
● network (from the events table)
● count_search_events (an integer type)
● count_clicks_events (an integer type)
● sum_network_reported_searches, (an integer type)
● sum_network_reported_clicks, (an integer type)
● sum_network_revenue, (a decimal type)

Provide the SQL joining the 3 tables to produce the final result.

(
select google_revenue.day, events.partner, evets.network, count_search_events(
select count(event_type) from events where event_type='search'), count_clicks_events(select count(event_type) into google from events where event_type='click'),google_network_reported_searches as sum_network_reported_searches, google_network_reported_clicks as sum_network_reported_clicks,google_network_revenue as sum_network_revenue from google_revenue
join events on google.domain=events.domain
and google.device_type=evenets.device_type
) full join
(
select yahoo_revenue.day, events.partner, evets.network, count_search_events(
select count(event_type) from events where event_type='search'), count_clicks_events(select count(event_type) into yahoo from events where event_type='click'),yahoo_network_reported_searches as sum_network_reported_searches, yahoo_network_reported_clicks as sum_network_reported_clicks,yahoo_network_revenue as sum_network_revenue from yahoo_revenue
join events on yahoo.tag=events.tag
and yahoo.device_type=evenets.device_type
) on google.day=yahoo.day
group by day