窗外

Dockerfile实践

最近在玩OpenCV,顺手build了一个OpenCV 3.2.0的Docker image。这个image基于Ubuntu 16.04,从OpenCV 3.2.0的source code build而来,顺带也build进了Python3的绑定。这个image比较适合用来作为开发和测试基于OpenCV的服务端程序的base image。由于包含了几乎全部的OpenCV组件,build的过程比较费时,image的尺寸也比较大,所以我将它push到了Docker Hub里。需要的话,可以用
docker pull chunliu/docker-opencv

把它拉下来使用。如果要精简组件,或build别的版本的OpenCV,可以修改Dockerfile,重新build。

实际上我以前并没有怎么用过Docker,只在虚机中安装过,顺着Docker官方的tutorial做过,并简单看过官方的几篇doc,仅此而已。大概明白Dockerfile是怎么回事,但没有写过很完整复杂的Dockerfile。事实证明,事非经过不知难,写这个Dockerfile还是有一些坑的。

首先,要写好这个Dockerfile,只靠记事本比较困难,使用辅助工具会容易一些。我用的是VS Code + Docker support,它能提供关键字着色和IntelliSense,也仅此而已。如果有工具能做语法检查就更好了,比如检查行尾是否少了一个续行符之类的。我开始几次都是跑build失败才发现,是某一行少了一个续行符。

另外,我没发现有什么好的方法来debug和测试Dockerfile。最开始,我是修改了Dockerfile之后就跑build,失败再找原因。但是这个build比较费时,这样不是很有效率。后来,我开始在一个container里,逐条跑Dockerfile里的命令,保证每条命令都没问题,再跑build。这样做的问题是,所有命令在一个bash session里跑成功了,并不能保证它们用RUN组织到Dockerfile以后,build还能成功。
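我后来采用的流程大致如下(一个示意,镜像以Ubuntu 16.04为例):

```shell
# 起一个一次性的交互式container,逐条验证打算写进RUN的命令
docker run -it --rm ubuntu:16.04 bash
# 在container内逐条执行命令,确认无误后再写回Dockerfile重新build
```

这样至少可以先排除命令本身的错误,剩下的就是命令如何组织进RUN的问题了。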

这就牵扯到Docker是怎么执行这个RUN的问题了。Docker的文档说,每一个RUN会是一个新的layer。我起初不太明白layer的含义,做过之后发现,所谓layer,就是一个中间状态的container。RUN后面的代码是在这个container里跑,跑完之后这个container被commit到image里,然后这个container被删除。后面的RUN,会基于新commit的image,起一个新的container。
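用一个极简的Dockerfile可以直观地看到这一点(只是示意,包名仅为举例):

```dockerfile
FROM ubuntu:16.04
# 下面两个RUN各生成一个layer:第一个RUN在临时container里执行,
# 执行完该container被commit成中间image;
# 第二个RUN基于这个中间image再起一个新的container
RUN apt-get update
RUN apt-get install -y build-essential
```

build完之后,用docker history查看这个image,可以看到每个RUN各对应一个layer。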

所以,如果两段代码需要在同一个bash session里跑的话,就必须放在同一个RUN里才行。举个例子,build OpenCV的时候,会用下面的方式来make:

mkdir build
cd build
cmake ......
make ......

如果将cd,cmake和make分开到不同的RUN中,那cmake和make就有问题了,因为工作路径不对。实际上,RUN的工作目录是由WORKDIR设定的,每个RUN开始时都会使用它上面最靠近它的WORKDIR作为工作目录。所以如果非要将上面的代码分开到不同的RUN,也可以在RUN之间插入WORKDIR来指定路径,不过路径跳来跳去的,比较混乱。
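所以像上面这样的步骤,通常会用&&和续行符合并进一个RUN(示意,cmake参数仅为举例):

```dockerfile
# 同一个RUN在同一个shell里执行,cd的效果对后续命令仍然有效
RUN mkdir build && cd build \
    && cmake -D CMAKE_BUILD_TYPE=RELEASE .. \
    && make -j4 \
    && make install
```

这样写还有一个好处:任何一步失败,整个RUN都会失败,不会commit出一个半成品的layer。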

细读Docker的两篇官方文档,对规避Dockerfile里的一些坑,是很有帮助的。


Open Live Writer

早年间Blog还流行的时候,微软出的Windows Live Writer是非常流行的一款离线写blog的工具。WLW我用过很久,一开始是和MSN Messenger一起装的,主要是支持MSN Spaces。后来微软不支持Spaces了,但WLW还留着,因为它也支持Wordpress。又后来MSN Messenger被淘汰了,我还是会通过Windows Live Essentials安装里面的WLW。直到WLW 2012之后,这个工具我还用了很久,直到它的开发和支持都停止了。

WLW之后,我就再没用过桌面编辑器写blog了。主要是blog写的也少了,偶尔写一篇,就在浏览器里解决了。再者也没发现顺手的工具。直到今天在讨论组里看到有人提起,原来WLW有了开源的版本,而且还发布到了Windows Store里面。赶紧下了一个来试试。新的OLW界面和WLW一致,看起来不是UWP应用,像是通过Desktop Bridge包装了一下。OLW是.NET Foundation支持的,官网是http://openlivewriter.org/,代码开源在Github上。我已经fork了一份。它的readme还介绍了一段OLW的历史,蛮有趣的。

我以前用WLW的时候,给它写过插件。当时就想,有些功能它是怎么实现的。现在开源了,而且这么怀旧的玩具,有空的时候真要好好玩玩。

Ubuntu 16.04

前两天收到通知,说是我host在Azure上的这台VM,可以升级到Ubuntu 16.04了。趁着有空,就将它升了上去。

说起来,这台VM也经历好几次版本升级了。最初的时候,OS是Ubuntu 13.04。后来升级到13.10,再后来是14.04。每次升级都或多或少会遇到一些问题,要花些时间troubleshooting。因为怕麻烦,升到14.04之后就没继续折腾15.10,呆在14.04有两年多了。

因为之前升级的时候遇到过问题,我在今天升级之前还专门搜了搜,果然还是有不少人在升级16.04的时候遇到问题。为防万一,我觉着还是先做个备份比较保险。这时候就显出用Azure的好处了。Azure里有一个VM Backup服务,大大简化了云端虚拟机的备份和恢复操作。这个服务也支持Hybrid模式,可以将本地VM备份到云端,考虑到云存储非常便宜,这确实是个不错的功能。我之前就建了一个备份策略来保护这台VM,所以只要跑一下已经定义好的Job,备份就完成了。万一升级出了问题,恢复也是一个按钮的事。

备份好之后,我就开始跑升级。没想到还挺顺利的,除了mysql升级失败之外,没有遇到什么会导致升级失败的错误。查log之后发现,mysql之所以失败,是因为apparmor保护了一些路径。因为升级的过程中,我选择保留所有的旧的配置文件,这导致mysql需要访问的一些新的文件路径是被apparmor保护的。改了apparmor的设置,问题就解决了。fail2ban也遇到一样的问题,我在旧版里修改的jail rule有一个bug,但旧版忽略了,新版就出错无法启动。修了这个bug之后就好了。其他的服务都没有遇到问题,升级之后就立即可用了。
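以mysql为例,当时的处理思路大致如下(具体路径以log里的拒绝记录为准,这里仅为示意):

```shell
# 从syslog里确认apparmor拒绝了哪些路径的访问
sudo grep -i 'apparmor.*denied' /var/log/syslog
# 编辑mysql的profile,为新版需要的路径补上相应的读写权限
sudo vi /etc/apparmor.d/usr.sbin.mysqld
# 重新加载这个profile使修改生效,再重启mysql
sudo apparmor_parser -r /etc/apparmor.d/usr.sbin.mysqld
sudo service mysql restart
```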

总之这次升级还蛮顺利的,一个上午就搞定了,我本来还打算搞一天的。

Crazyflie

Crazyflie 2.0
My Crazyflie 2.0

Azure Marketplace搞了一个叫做Super Human的推广活动,推广Azure Marketplace里的各种服务。这个活动推出了一些virtual labs,你如果正好对这些服务感兴趣,通过这些virtual labs,可以学习怎么在Azure中使用它们。

其实virtual labs不是我想说的重点。重点是,如果你成功做出了某个virtual lab的结果,会得到一个奖励。你可以选择得到一个为期3个月的Azure Pass,或者一个Crazyflie 2.0无人机。3个月的Azure Pass也许不错,可是我想大家应该都会选无人机吧?

所以,这个无人机才是我想说的重点,因为我做出了其中一个lab,并且拿到了无人机。这个活动的奖品只能寄到美国的地址,我辗转托了两个朋友才弄回新加坡。这个官网售价180美金的无人机,需要自己动手组装,上面的照片就是组装好之后的样子,非常小巧。昨天装好之后,我拿去公园的空地上试了试,让无人机飞起来倒没问题,只是操控还不熟悉,结果一个螺旋桨的塑料支架被我摔裂了。不过还好,包装里每个配件都多提供了一个备用件,很好更换。

Build a SharePoint Server 2016 Hybrid Lab

SharePoint Server 2016 has been out there for a while. One big feature of it is the hybrid configuration with Office 365. To understand how it works, I built a lab environment based on Azure VMs and a trial subscription of Office 365. Here is how I did it.

Prerequisites

To build a lab environment for hybrid solutions, you need the following components in place.

  • An Office 365 subscription. A trial is fine.
  • A public domain name. The default <yourcompany>.onmicrosoft.com domain that you get from the O365 subscription won’t work in hybrid scenarios. You have to register a public domain if you don’t have one.

Configure Office 365

In order to configure the hybrid environment, you must register a public domain with your O365 subscription. The process goes roughly like this: you go to your O365 admin center and kick off the domain setup process, and O365 generates a TXT value. You then create a TXT record with that value in the DNS zone of your domain registrar, and ask O365 to verify it. Once verified, the domain is registered with your O365 subscription. More details can be found here.

You don’t need to create the DNS records for mail exchange, such as the MX record, if you just want to test SharePoint hybrid scenarios. You only need them if you also want to test the mailbox features.
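Before asking O365 to verify the domain, you can check whether the TXT record has propagated. A quick sanity check (the domain is a placeholder):

```shell
# Query the TXT records of your domain; the O365-generated
# verification value (e.g. MS=msXXXXXXXX) should appear in the output
dig TXT yourdomain.com +short
```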

The next step is to configure AD sync between your on-premises AD and the Azure AD created with your O365 subscription. You can use the Azure AD Connect tool to do it. For a lab environment, AD sync with password sync is good enough. You can also try SSO with AD sync if you have an AD FS to play with.

Before kicking off the AD sync, you might have to do some cleaning of the AD attributes. I changed the following:

  • Add a valid and unique email address in the proxyAddresses attribute.
  • Ensure that each user who will be assigned Office 365 service offerings has a valid and unique value for the userPrincipalName attribute in the user’s user object.

With the cleaning done, you can start to sync the AD. You should be able to see the user accounts in the O365 admin center after syncing.

Configure SharePoint Server 2016

Deploy the SharePoint Server 2016 farm. You can try the MinRole deployment if you have multiple servers. In my lab, I just deployed a single server.

The following service applications are required for the hybrid scenarios.

  • Managed Metadata Service
  • User Profile Service with user profile sync and MySite host.
  • App Management Service
  • Subscription Settings Service
  • Search Service for hybrid search scenario

The user profile properties need to have the following mapping:

  • User Principal Name property is mapped to userPrincipalName attribute.
  • Work email property is mapped to mail attribute.

Configure Hybrid

Once you have O365 and SharePoint Server 2016 ready, you can start to configure the hybrid. It is fairly simple with the help of the Hybrid Picker of SharePoint Online. You just need to go to the SharePoint admin center of O365, click configure hybrid, pick a hybrid solution, and follow the wizard. If everything is ok, you will get the hybrid configured. Browse to an on-premises site, and you should see the app picker like the screenshot below.

Next Step

Next thing to try is to configure the server to server trust and the cloud hybrid search. Stay tuned.

 

谁还需要移动硬盘?

项目里之前在Azure上搭了一个测试环境,跑着几台虚拟机。前阵子项目告一段落了,想着把这个Azure上的环境停掉,省点钱。又想着这几台虚拟机搭起来也费事,还是留个备份比较好,不如把它们的vhd全部下载下来,保存在移动硬盘上。

这几个vhd加起来大概需要800GB的磁盘空间。我就从家里翻出一块许久没用的WD My Passport硬盘,USB 3.0的接口,1TB的容量,大概是2013年买的,一直没怎么用过。我心说这次派上用场了。谁知插到电脑上一试,坏掉了:Windows可以认到硬盘,但不能mount,容量显示0。网上查了半天,没找到修复的方法。

同事知道了,把他的一块全新未开封的硬盘借给了我,说是趁打折买的,一直也没用。他的这块硬盘倒是好的,但是我发现,从Azure Storage上下载800GB的数据,所花的时间太长了,以后真要重建这个环境的话,还要花更长的时间上传,根本不划算。

最后,我根本没用移动硬盘,而是将这些vhd用AzCopy备份到另一个Azure Storage里面了。用AzCopy的异步拷贝,这样备份很快,以后要重建也很容易。关键是存储几乎是Azure里最便宜的服务,1GB每个月只要2.4美分,又没有数据丢失的风险,比下载下来备份到移动硬盘里安全多了。
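当时用的AzCopy命令大致如下(账号名、容器名和vhd文件名都是示意,key省略):

```shell
# blob到blob的拷贝默认走Azure服务端的异步拷贝,数据不经过本机,
# 所以既不占本地带宽,速度也快
AzCopy /Source:https://srcaccount.blob.core.windows.net/vhds ^
       /Dest:https://dstaccount.blob.core.windows.net/backup ^
       /SourceKey:<源存储账号的key> /DestKey:<目标存储账号的key> ^
       /Pattern:testvm.vhd
```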

我不禁想,这以后谁还需要移动硬盘呢?市面上已经有了各种云存储服务,对普通用户来说足够用了,不用担心磁盘损坏或者换系统换机器时数据丢失,而且长远看云存储的价格会越来越低。移动硬盘快被淘汰了,可能连这个概念都快消失了吧?

Mini PC开箱

前一阵子偶然看到Scooter Computer,觉得蛮有趣的。家里不缺路由器,倒是正好缺台性能够用的电脑。之前我一直在看笔记本电脑,但是性能尚可的笔记本,价格至少过千新币。这台Mini PC的配置看来正适合我的要求,就订了一台。等了两个星期,昨天货送到了。送货的周期有点长,但是免费的。

我订的是i7 + 8GB RAM + 128GB SSD的版本,320多块美金。卖家还是蛮良心的,内存和SSD用的都是单条大厂品牌(三星的内存,东芝的SSD),还各留了一个插槽可供继续扩展。不像那些笔记本厂家,总是用小容量内存把所有插槽插满,你想扩展的时候,才发现换下来的内存放着没用,扔了可惜。

我有一块闲置的240GB Intel 2.5″ SSD,装在SATA接口上正合适。趁着上个周末的IT Show,我还淘了一块性价比不错的Dell显示器和一个BOSE SoundLink Mini II,搭配在一起正合适。经过这两天的测试,机器蛮不错的:所有驱动Windows 10都自动识别安装,没有杂牌配件需要自己找驱动的情况;没有噪音,铝壳发热很低,也可能是上网听音乐Load不重的关系,回头再试试视频播放的情况;WiFi模块带蓝牙,可以适配各种蓝牙设备。稍显美中不足的是,WiFi模块用的是Broadcom的802.11n,单频不支持5GHz。不过看在价钱的份上,也很难要求更多了。综合起来,我还是相当满意的。

另外值得一提的,就是AliExpress了。这是我第一次在AliExpress上买东西,体验跟淘宝一致,很有希望做大啊。

Stack Overflow架构,2016版

Nick Craver在他的blog上发了一篇2016版的Stack Overflow架构,其中的几组数据,尤其令人印象深刻:

网站的统计数据:

  • 209,420,973 (+61,336,090) HTTP requests to our load balancer
  • 66,294,789 (+30,199,477) of those were page loads
  • 1,240,266,346,053 (+406,273,363,426) bytes (1.24 TB) of HTTP traffic sent
  • 569,449,470,023 (+282,874,825,991) bytes (569 GB) total received
  • 3,084,303,599,266 (+1,958,311,041,954) bytes (3.08 TB) total sent
  • 504,816,843 (+170,244,740) SQL Queries (from HTTP requests alone)
  • 5,831,683,114 (+5,418,818,063) Redis hits
  • 17,158,874 (not tracked in 2013) Elastic searches
  • 3,661,134 (+57,716) Tag Engine requests
  • 607,073,066 (+48,848,481) ms (168 hours) spent running SQL queries
  • 10,396,073 (-88,950,843) ms (2.8 hours) spent on Redis hits
  • 147,018,571 (+14,634,512) ms (40.8 hours) spent on Tag Engine requests
  • 1,609,944,301 (-1,118,232,744) ms (447 hours) spent processing in ASP.Net
  • 22.71 (-5.29) ms average (19.12 ms in ASP.Net) for 49,180,275 question page renders
  • 11.80 (-53.2) ms average (8.81 ms in ASP.Net) for 6,370,076 home page renders

目前的架构:

  • 4 Microsoft SQL Servers (new hardware for 2 of them)
  • 11 IIS Web Servers (new hardware)
  • 2 Redis Servers (new hardware)
  • 3 Tag Engine servers (new hardware for 2 of the 3)
  • 3 Elasticsearch servers (same)
  • 4 HAProxy Load Balancers (added 2 to support CloudFlare)
  • 2 Networks (each a Nexus 5596 Core + 2232TM Fabric Extenders, upgraded to 10Gbps everywhere)
  • 2 Fortinet 800C Firewalls (replaced Cisco 5525-X ASAs)
  • 2 Cisco ASR-1001 Routers (replaced Cisco 3945 Routers)
  • 2 Cisco ASR-1001-x Routers (new!)

Nick说他会写一个系列,非常期待后续的详细介绍。

 

Remote Desktop for Linux VM on Azure

Usually you don’t need remote desktop or VNC on Linux servers running in the cloud. But as I wanted to try some scenarios with a Linux desktop, and I don’t actually have a physical machine loaded with any Linux OS, I ended up setting up an Ubuntu server on Azure and enabling remote desktop on it.

Obviously, I am not the first one who wants to use remote desktop on servers running in the cloud. There are plenty of posts on the internet talking about how to do it. Most of them are about using xrdp + xfce4, including this one for Azure VMs. I am using the Ubuntu 15.10 image. The only gotcha is that running the following command could uninstall the waagent service.

$ sudo apt-get install ubuntu-desktop

This is a known issue that you can track on github.com. To get waagent back, you have to reinstall it with the following:

$ sudo apt-get install --reinstall walinuxagent

I ended up not installing ubuntu-desktop. Without it, you also avoid installing applications that you don’t need, such as the office suite.

Although xfce4 is good enough as a lightweight window manager, I am more used to a GNOME-like desktop environment. So I decided to try the MATE desktop. The configuration is very easy. Just run the following:

sudo apt-get install xrdp
sudo apt-get install mate-desktop-environment
echo mate-session >~/.xsession
sudo service xrdp restart

You may have to reboot the server after installing the desktop environment.

That’s all. I am using it now and so far so good.

Linux VM on Azure: A Mail Server

I need a mail server which can serve emails with my own domains for a small group of users. I don’t want to go with Exchange server as it is too heavy. With some search on the internet, I decided to set up a small Linux node with Postfix on Azure. Such a small node only costs about $40/month, which can be covered by my MSDN subscription, and it is good enough for the purpose.

I chose to set up the VM with the Ubuntu server image. There are a lot of posts online about how to set up Postfix on Ubuntu. In particular, I followed this one as I also needed the virtual domain and virtual mailbox setup. It is quite clear and easy to follow, and I got Postfix and Dovecot up and running with it.

The missing parts of the above post are the anti-spam and anti-virus portions. Fortunately, Amavisd-new + SpamAssassin + ClamAV make things a lot easier. The Ubuntu help page here is good enough for the purpose.

With all of this set up, I have a mail server which can send and receive emails. However, when I tried to send emails to the big mailbox hosts like Gmail or Outlook, my emails were rejected, as they don’t accept emails from dynamic IP ranges, which are unfortunately used by public cloud vendors like Azure and AWS. The only way to work around it is to relay emails through a mail delivery service. With Azure, we can leverage SendGrid, which has a free plan for Azure accounts with a quota of 25,000 emails per month. So I created a SendGrid account in Azure and configured Postfix to relay emails to it. SendGrid has a short guide for it.
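The relevant Postfix settings in main.cf look roughly like the following (a sketch; the credential file path is just a common convention, and the SendGrid username/password go into that file):

```shell
# /etc/postfix/main.cf -- relay all outgoing mail via SendGrid
relayhost = [smtp.sendgrid.net]:587
smtp_sasl_auth_enable = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt
```

After editing /etc/postfix/sasl_passwd, remember to run postmap on it and reload Postfix.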

The only problem I encountered with the relay is that Amavisd-new threw an error “TLS is required, but was not offered by host”. With some search, I found the workaround here. After fixing it, the emails were accepted by Gmail and Outlook happily.