微软2017年预科生计划在线编程笔试第二场总结

07:32 Apr.09-2017

昨天参加了微软实习招聘的笔试，感受颇深。

首先，感觉微软对于算法的要求确实比较高。总共四道题算法题，每一道题都需要比较巧妙的思想。而我当时一道题都没有完全做出来（惭愧）。

下面简要记录一下我对于这四道题的想法。

题目1：Queen Attack
题目2 Diligent Robots

题目1：Queen Attack

时间限制:10000ms

单点时限:1000ms

内存限制:256MB

描述

There are N queens in an infinite chessboard. We say two queens may attack each other if they are in the same vertical line, horizontal line or diagonal line even if there are other queens sitting between them.

Now given the positions of the queens, find out how many pairs may attack each other?

输入

The first line contains an integer N.

Then N lines follow. Each line contains 2 integers Ri and Ci indicating there is a queen in the Ri-th row and Ci-th column.

No two queens share the same position.

For 80% of the data, 1 <= N <= 1000

For 100% of the data, 1 <= N <= 100000, 0 <= Ri, Ci <= 1000000000

输出

One integer, the number of pairs may attack each other.

样例输入

样例输出

思路

思路1：brute force[$O(n^2)$ 80%]

由于这次比赛是可以得部分分的，因此我在面对这道题的时候为了节省时间直接采用了最暴力的方法，枚举每一对皇后，检查是否能相互攻击，时间复杂度为O($N^2$)，最终通过了80%的测试用例。代码如下：

#include <vector>
#include <iostream>
using namespace std;

class point {
public:
    int x, y;
    point(int _x, int _y) : x(_x), y(_y) {};
};

bool canAttack(point A, point B) {
    return (A.x == B.x || A.y == B.y || A.x + A.y == B.x + B.y || A.x - A.y == B.x - B.y);
}

int main(int argc, const char * argv[]) {
    int N;
    cin >> N;
    vector<point> v;
    for (int i = 0; i < N; i ++) {
        int x, y;
        cin >> x >> y;
        v.push_back(point(x, y));
    }
    int ans = 0;
    for (int i = 0; i < N; i ++) {
        for (int j = i+1; j < N; j ++) {
            if (canAttack(v[i], v[j])) {
                ans ++;
            }
        }
    }
    cout << ans << endl;
    return 0;
}

思路2：排序[O(nlgn), AC]

思路1慢就慢在需要枚举每一对皇后，但是实际上我们可以把攻击条件分为四种：位于相同的行，位于相同的列，坐标和相等，坐标差相等。拆分以后我们可以知道如下性质：

在某个攻击条件下，如果A，B能相互攻击，B，C不能相互攻击，则A，C不能相互攻击。这是显然的，如对于行攻击，如果A，B位于同一行，B，C位于不用行，则A，C位于不同行。

若A，B在某个攻击条件下能相互攻击，则在别的攻击条件下一定不能相互攻击。这是因为若A，B能在不少于两个攻击条件下相互攻击，则可知A，B位于相同的位置，而题目中强调了这种情况不可能出现。

由以上两条性质，我们就可以得到如下思路：

对于每一个攻击条件，将所有皇后所形成的集合划分为$S={s_i}$，集合$s_i$内的皇后都能互相攻击，不同集合间的皇后不能互相攻击（由性质1保证总能实现这样的划分），则在该攻击条件下，能相互攻击的皇后个数为$\sum_{i} C_{|s_i|}^2$，最终将四种攻击条件所得到的结果相加即为最终结果（性质2）。

具体实现方式（以行攻击为例，其余攻击方式同理）

将所有皇后按照行排序，然后遍历所有皇后，统计每行有多少个皇后（设为$n_i$），位于该行的皇后能相互攻击，位于不同行的皇后在行攻击的条件下不能相互攻击，因此可以进行行攻击的皇后对数为$\sum_{i}C_{n_i}^2$。

该思路的时间复杂度为$O(n\ lgn)$，AC，代码如下（稍微写的有点啰嗦，虽然可以通过函数指针的方式来压缩代码，但是这将大大影响代码可读性，故放弃）：

// 省略include

class point {
public:
    int x, y, sum, diff;
    point(int _x, int _y) : x(_x), y(_y) {
        sum = x + y;
        diff = x - y;
    };
};

struct {
    bool operator() (const point& p1, const point& p2) const {
        return p1.x < p2.x;
    }
}  xCmp;

struct {
    bool operator() (const point& p1, const point& p2) const {
        return p1.y < p2.y;
    }
}  yCmp;

struct {
    bool operator() (const point& p1, const point& p2) const {
        return p1.sum < p2.sum;
    }
}  sumCmp;

struct {
    bool operator() (const point& p1, const point& p2) const {
        return p1.diff < p2.diff;
    }
}  diffCmp;

int main(int argc, const char * argv[]) {
    int N;
    cin >> N;
    vector<point> v;
    for (int i = 0; i < N; i ++) {
        int x, y;
        cin >> x >> y;
        v.push_back(point(x, y));
    }

    int ans = 0;

    sort(v.begin(), v.end(), xCmp);
    int nSame = 0;
    for (int i = 1; i < N; i ++) {
        if (v[i].x == v[i-1].x) {
            nSame ++;
            ans += nSame;
        }
        else {
            nSame = 0;
        }
    }

    sort(v.begin(), v.end(), yCmp);
    nSame = 0;
    for (int i = 1; i < N; i ++) {
        if (v[i].y == v[i-1].y) {
            nSame ++;
            ans += nSame;
        }
        else {
            nSame = 0;
        }
    }

    sort(v.begin(), v.end(), sumCmp);
    nSame = 0;
    for (int i = 1; i < N; i ++) {
        if (v[i].sum == v[i-1].sum) {
            nSame ++;
            ans += nSame;
        }
        else {
            nSame = 0;
        }
    }

    sort(v.begin(), v.end(), diffCmp);
    nSame = 0;
    for (int i = 1; i < N; i ++) {
        if (v[i].diff == v[i-1].diff) {
            nSame ++;
            ans += nSame;
        }
        else {
            nSame = 0;
        }
    }

    cout << ans << endl;
    return 0;
}

思路3：计数排序[O(n), AC]

这个思路是和室友在讨论时获得的，既然我们只需要统计每个攻击条件下某个位置的皇后数，那完全没必要使用$nlgn$级别的排序，直接使用计数排序就好了呀（当然，由于坐标范围比较大，因此在具体实现时需要采用unordered_map来计数）。

由于unrodered_map的插入和访问时间复杂度均为$O(1)$，因此该思路的时间复杂对为$O(n)$。代码如下（同样的，虽然代码仍然可以继续简化，但为了可读性我放弃了）：

// 省略include和point的定义
int main(int argc, const char * argv[]) {
    int N;
    cin >> N;
    vector<point> v;
    for (int i = 0; i < N; i ++) {
        int x, y;
        cin >> x >> y;
        v.push_back(point(x, y));
    }

    int ans = 0;
  	// 统计每种攻击条件下位于"相同位置"的皇后的数量。
    unordered_map<int, int> countx;
    unordered_map<int, int> county;
    unordered_map<int, int> countsum;
    unordered_map<int, int> countdiff;
    for (int i = 0; i < N; i ++) {
        countx[v[i].x] ++;
        county[v[i].y] ++;
        countsum[v[i].sum] ++;
        countdiff[v[i].diff] ++;
    }
  	// 计算行攻击的皇后对数
    for (auto cx: countx) {
        int c = cx.second;
        ans += (c * (c-1)) / 2;
    }
  	// 列
    for (auto cx: county) {
        int c = cx.second;
        ans += (c * (c-1)) / 2;
    }
  	// 坐标和
    for (auto cx: countsum) {
        int c = cx.second;
        ans += (c * (c-1)) / 2;
    }
  	// 坐标差
    for (auto cx: countdiff) {
        int c = cx.second;
        ans += (c * (c-1)) / 2;
    }

    cout << ans << endl;
    return 0;
}

题目2 Diligent Robots

时间限制:10000ms

单点时限:1000ms

内存限制:256MB

描述

There are N jobs to be finished. It takes a robot 1 hour to finish one job.

At the beginning you have only one robot. Luckily a robot may build more robots identical to itself. It takes a robot Q hours to build another robot.

So what is the minimum number of hours to finish N jobs?

Note two or more robots working on the same job or building the same robot won’t accelerate the progress.

输入

The first line contains 2 integers, N and Q.

For 70% of the data, 1 <= N <= 1000000

For 100% of the data, 1 <= N <= 1000000000000, 1 <= Q <= 1000

输出

The minimum number of hours.

样例输入

10 1

样例输出

思路

思路1：枚举复制了多少个机器人[O(n) 80%]

首先我们可以知道机器人复制一定发生在开始工作之前，即应该是复制完所有机器人以后才开始工作的（当然，当复制最后几个机器人时未参与复制的那些机器人可以先开始工作）。

设在总过程中共有x个机器人参与工作，即复制了x-1个机器人，则我们可以知道复制周期为k=ceil(log2(x))。故花费在复制上面的时间为t1=Q*k，由于最后一个复制周期中可能存在机器人不参与复制，因此这部分机器人将完成一部分的工作，剩余的工作量为N1=N-(pow(2, k) - x) * Q。故完成所有工作的总时间为t=t1 + ceil(float(N1)/x)。

因此我们就可以从小到大开始枚举x，然后计算所需时间$t(x)$，直至找到一个最优解。

下面我们需要解决的问题就是x的枚举上界。显然，x不应该大于N，其次，我们可以看出来总时间应该是随x先下降后增加的。因此我们可以在遍历x的过程中查看时间代价是否比复制x-1个机器人多，若多，则说明已经开始上升了，最优解不可能存在于x之后，不必继续遍历。

基于以上观察，可写出代码如下（通过了80%的样例，另外20%的样例超时）：

#include <iostream>
#include <cmath>
using namespace std;
#define INT_MAX 1000000000000
int main(int argc, const char * argv[]) {
    long long N, Q;
    cin >> N >> Q;
    long long ans = INT_MAX;
    long long t_before = INT_MAX;
    for (int x = 1; x <= N; x ++) {
        long long k = ceil(log2(float(x)));
        long long t1 = Q * k;
        long long N1 = N - (pow(2, (float)k) - x) * Q;
        long long t = t1 + ceil(float(N1)/x);
        if (t > t_before) {
            break;
        }
        ans = min(ans, t);
        t_before = t;
    }
    cout << ans << endl;
    return 0;
}

思路2：只允许全部复制[$O(lgn)$ AC]

思路1中比较麻烦的一点就是可能存在部分复制，即允许在最后一个复制周期中存在机器人不参与复制。如果能去除掉这种可能就好了。

幸运的是，通过证明我们可以发现，部分复制不可能是最优解，因此我们可以大胆的将之抛弃，代码如下：

#include <cmath>
#include <cstdio>
#include <iostream>
using namespace std;
#define INF 1000000000000

int main() {
    long long N;
    int Q;
    cin >> N >> Q;
    long long t = INF;
    long long x = 1;
    long long count = 0;
    while (x < N) {
        t = min(t, (long long)ceil((float(N)/x)) + count * Q);
        count ++;
        x <<= 1;
    }
    cout << t << endl;
}

证明过程见下（若对证明部分不感兴趣可忽略）：

显然，对于全复制方案，复制k轮后机器人总数为$2^k$。

设某部分复制方案最终有x个机器人参与工作，则$\exists k \in N, 2^k < x < 2^{k+1}$。我们可以证明一个很强的结论：部分复制不可能比两端的全复制更优。即$t(x) < t(2^k), t(x) < t(2^{k+1})$不可能同时成立。

由思路1我们可以知道：

$t(2^k) = kQ + \lceil \frac{N}{2^k} \rceil$
$t(2^{k+1}) = (k+1)Q + \lceil \frac{N}{2^{k+1}} \rceil$
$t(x) = (k+1)Q + \lceil \frac{N - (2^{k+1} - x)Q}{x} \rceil$

又 $t(x) < t(2^k) $

$\Leftrightarrow kQ + \lceil \frac{N}{2^k} \rceil - ((k+1)Q + \lceil \frac{N - (2^{k+1} - x)Q}{x} \rceil) > 0$

$\Leftrightarrow \lceil \frac{N}{2^k} \rceil > \lceil \frac{xQ + N - (2^{k+1} - x)Q}{x} \rceil$

$\Leftrightarrow \frac{N}{2^k} > \frac{N - 2^{k+1}Q + xQ}{x}$

$\Leftrightarrow N > 2^{k+1}Q$

另一方面 $t(x) < t(2^{k+1}) $

$\Leftrightarrow (k+1)Q + \lceil \frac{N}{2^{k+1}} \rceil - ((k+1)Q + \lceil \frac{N - (2^{k+1} - x)Q}{x} \rceil) > 0$

$\Leftrightarrow \lceil \frac{N}{2^{k+1}} \rceil > \lceil \frac{N - (2^{k+1} - x)Q}{x} \rceil $

$\Leftrightarrow \frac{N}{2^{k+1}} > \frac{N - (2^{k+1} - x)Q}{x}$

$\Leftrightarrow N < 2^{k+1}Q$

$N > 2^{k+1}Q$ 和$N > 2^{k+1}Q$为两个矛盾的条件，故$t(x) < t(2^k) $和$t(x) < t(2^{k+1}) $不可能同时成立。

证毕。

思路三：直接求解[O(1) AC]

其实，既然最优解不可能来自部分复制，那剩余的可能性就不多了，我们是否可以直接求出最优解呢？幸运的是，可以的，证明如下：

由于$t(2^k) = kQ + \lceil \frac{N}{2^k} \rceil$，故对于最优解，有$t(2^{k-1}) \leq t(2^k) \leq t(2^{k+1})$ 或 $k = 0$。若后者成立，则代表最优解发生在不发生任何复制时，这时候有$Q \geq N$。若前者成立，则 $t(2^{k} \leq t(2^{k+1}) \Leftrightarrow \frac{N}{2^{k+1}} + Q - \frac{N}{2^{k}} \geq 0 \Leftrightarrow k+1 \geq log_2{\frac{N}{Q}}$

$t(2^{k-1} \leq t(2^{k}) \Leftrightarrow k \leq log_2{\frac{N}{Q}}$

因此$k=\lfloor log_2{\frac{N}{Q}} \rfloor$

代码如下：

#include <iostream>
#include <cmath>
using namespace std;

int main()
{
    double N, Q;
    cin >> N >> Q;
    if(Q >= N)
    {
        cout << N;
        return 0;
    }
    int k = floor(log2(N / Q));
    int ans = k * Q + ceil(N / pow(2, k));
    cout << ans;
    return 0;
}

题目3：A Box of Coins

时间限制:10000ms

单点时限:1000ms

内存限制:256MB

描述

Little Hi has a box which consists of 2xN cells as illustrated below.

+----+----+----+----+----+----+
| A1 | A2 | A3 | A4 | .. | AN |
+----+----+----+----+----+----+
| B1 | B2 | B3 | B4 | .. | BN |
+----+----+----+----+----+----+

There are some coins in each cell. For the first row the amounts of coins are A1, A2, … AN and for the second row the amounts are B1, B2, … BN.

Each second Little Hi can pick one coin from a cell and put it into an adjacent cell. (Two cells are adjacent if they share a common side. For example, A1 and A2 are adjacent; A1 and B1 are adjacent; A1 and B2 are not adjacent.)

Now Little Hi wants that each cell has equal number of coins by moving the coins. He wants to know the minimum number of seconds he needs to accomplish it.

输入

The first line contains an integer, N. 2 <= N <= 100000

Then follows N lines. Each line contains 2 integers Ai and Bi. (0 <= Ai, Bi <= 2000000000)

It is guaranteed that the total amount of coins in the box is no more than 2000000000 and is a multiple of 2N.

输出

The minimum number of seconds.

样例输入

2
3 4
6 7

样例输出

思路

思路1：简化问题[错误]

考试的时候，看到这道题我就想起来以前在刘汝佳的《算法竞赛入门经典训练指南》里看到的那道分金币题。

那道题的设定是一个环，相邻的人可以互给金币，求最少可以转移多少金币使得每个人手中的金币数量相等。最终使用求中位数的方式在$O(n)$的时间复杂度下解决。

由于是考试，没有太多时间让我去慢慢证明这两道题的设定是等价的，于是我就先把代码写出来提交上去试试，结果发现不等价，当时没有进一步的思路，于是放下了这道题，去做下一道题。(错误的代码就不再贴了)

思路2：贪心[O(n), AC]

这是在考完后和室友讨论时室友提出的思路，把一列先当成一个整体，则列与列之间需要传递的金币数是可以直接算出来的，而每一列的两行之间也只需相互弥补即可。实现的过程中需要注意的是ans可能会大于$\text{INT_MAX}$，因此ans也需要使用long long储存。

代码如下：

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    long long N, ans = 0;
    long long avg = 0;
    cin >> N;
    vector<vector<long long> > tab(N, vector<long long>(2, 0));
    for(int i = 0; i < N; i++)
    {
        cin >> tab[i][0] >> tab[i][1];
        avg += tab[i][0] + tab[i][1];
    }
    avg /= 2 * N;
    for(int i = 0; i < N - 1; i++)
    {
        long long col = tab[i][0] + tab[i][1];
        ans += abs(col - 2 * avg);
        if(col > 2 * avg)
        {
            if(tab[i][0] >= avg && tab[i][1] >= avg)
            {
                tab[i + 1][0] += tab[i][0] - avg;
                tab[i + 1][1] += tab[i][1] - avg;
            }
            else
            {
                int j = tab[i][0] >= avg ? 0 : 1;
                ans += avg - tab[i][1 - j];
                tab[i + 1][j] += col - 2 * avg;
            }
        }
        else if(col <= 2 * avg)
        {
            if(tab[i][0] <= avg && tab[i][1] <= avg)
            {
                tab[i + 1][0] += tab[i][0] - avg;
                tab[i + 1][1] += tab[i][1] - avg;
            }
            else
            {
                int j = tab[i][0] <= avg ? 0 : 1;
                ans += tab[i][1 - j] - avg;
                tab[i + 1][j] += col - 2 * avg;
            }
        }
    }
    ans += abs(tab[N - 1][0] - avg);
    cout << ans;
    return 0;
}

题目4： EL SUENO

时间限制:10000ms

单点时限:1000ms

内存限制:256MB

描述

In a video game, Little Hi is going to assassinate the leader of an evil group, EL SUENO.

There are N high-value targets in the group, numbered from 1 to N. Each target has another target as his direct superior except for EL SUENO who has no superior. So the superior-subordinate hierarchy forms a tree-like structure. EL SUENO is at the root.

The cost of assassinating the i-th target is Ci. Before assassinating a target Little Hi has to obtain enough information about him. For the i-th target, Little Hi needs INi units of information. By assassinating a target Little Hi will obtain some information about the target’s direct superior. More specifically, the i-th target has IPi units of information about his superior. So in order to assassinate EL SUENO, Little Hi needs to assassinate some of his direct subordinates so that the sum of information obtained is enough; assassinating the subordinates needs to assassinate their direct subordinates … until it reaches some targets require zero information in advance. (Luckily if a target has no subordinate he always requires zero information.)

How Little Hi wants to know what is the minimum cost for successful assassinating EL SUENO.

输入

The first line contains an integer N.

Then follow N lines. Each line describes a target with 4 integers, Fi, INi, IPi, Ci.

Fi is i-th target’s superior. For EL SUENO his Fi is zero.

INi is the amount of information needed to assassinate the i-th target. For a target has no subordinates his INi is always zero.

IPi is the amount of information the i-th target has about his superior Fi.

Ci is the cost of assassinating the i-th target.

For 30% of the data, 1 <= N <= 10, 0 <= INi, IPi <= 100

For 60% of the data, 1 <= N <= 100, 0 <= INi, IPi <= 1000

For 100% of the data, 1 <= N <= 2000, 0 <= Fi <= N, 0 <= INi, IPi <= 20000, 1 <= Ci <= 1000.

It is guaranteed that the N targets form a tree structure.

输出

The minimum cost. If Little Hi is not able to assassinate EL SUENO output a number -1.

样例输入

样例输出

思路

这道题可以通过dfs来解决，对于每一个节点，若杀死该节点所需的信息量为0，则杀死该节点的最小代价为$C_i$，否则，首先递归计算其所有孩子是否能被杀死，若能，则计算杀死该孩子$child_i$所需要的最小代价$cost_i$。

若已经得知了所有孩子节点的信息，设能杀死的孩子节点集为$s$，杀死$s$中的孩子的最小代价集为$costs = {cost_{s_i}}$，可获得的信息量为$ips = {IP_{s_i}}$。则问题转换为从$ips$中选取一些元素使得

元素和大于杀死当前节点的$IN$值。
对应的代价和最小。

这是一个最小背包问题，可用动态规划解决。需要注意的是如果使用一维动态规划则数组遍历方式为从高到低，防止脏读。

代码如下：

#include <iostream>
#include <vector>
using namespace std;

#define INF 1000000000
long long max(long long a, long long b) { return a > b ? a : b; }
long long min(long long a, long long b) { return a < b ? a : b; }

class node {
public:
    node *p;
    vector<node*> childs;
    long long c, in, ip;
    node(vector<long long>& v) {
        in = v[1];
        ip = v[2];
        c = v[3];
        p = nullptr;
    }
};

long long dfs(node* root) {
    vector<long long> tab(vector<long long>(root->in + 1, INF));
    tab[0] = 0;
    for (auto c: root->childs) {
        long long costAll = dfs(c);
        for (long long j = root->in; j > 0 ; j --) {
            tab[j] = min(tab[max(0, j-c->ip)] + costAll, tab[j]);
        }
    }
    return tab[root->in] + root->c;
}

int main(int argc, const char * argv[]) {
    long long N;
    cin >> N;
    vector<vector<long long>> v(N, vector<long long>(4, 0));
    for (long long i = 0; i < N; i ++) {
        for (long long j = 0; j < 4; j ++) {
            cin >> v[i][j];
        }
    }
    // insert
    node* root = NULL;
    vector<node*> i2node(N, NULL);
    for (long long i = 0; i < N; i ++) {
        node* cur = new node(v[i]);
        i2node[i] = cur;
        if (v[i][0] == 0) {
            root = cur;
        }
        for (long long j = 0; j < i; j ++) {
            if (v[j][0] - 1 == i) {
                cur->childs.push_back(i2node[j]);
                i2node[j]->p = cur;
            }
            if (v[i][0] - 1 == j) {
                cur->p = i2node[j];
                i2node[j]->childs.push_back(cur);
            }
        }
    }
    // dfs
    long long ans = dfs(root);
    cout << (ans < INF ? ans : -1) << endl;
    return 0;
}

🏷 algorithm, C++,

记一次脑残的mongodb连接经历 »